Why the Simplest Cradle-to-Career Data System May Be the Best Place to Start


A long-awaited window is opening. Governor Gavin Newsom’s first budget proposes $10 million to develop a longitudinal education data system, meaning that California might soon be able to answer questions like:

  • Which high school graduates from which schools successfully transition to college? Which of those successfully earn postsecondary certificates and degrees?
  • What are the workforce outcomes of high school graduates who do not go directly to college?
  • How successful, by major and degree/credential, are graduates from our postsecondary institutions when they reach the workforce?
  • How do all of these outcomes vary by students’ race/ethnicity, income level, region of the state, and other important factors?

Answers to these questions (and many others) could help policymakers and educators figure out how to better support students in meeting their educational goals and, by extension, the state’s workforce needs. A statewide data system could also help students and families make informed decisions about college and career. But the people who will decide whether and how to create a statewide data system face some critical choices—namely, what purpose would such a data system serve? Who would use it and which questions would it be designed to answer?

The Governor’s vision, articulated in his proposed budget, is that a longitudinal data system would “better track student outcomes and increase the alignment of our educational system to the state’s workforce needs.” Now that the new Administration has removed the unspoken moratorium on expanding the state-run collection of education data that existed under Jerry Brown, interest is growing among legislators, and they are seeking advice on how best to proceed. Other research and advocacy organizations are weighing in on the benefits to be gained in developing a statewide data system. At EdInsights, we spent two years looking at the existing education data infrastructure in California and recommended a set of policy criteria for consideration in any new statewide data system.

In this context, one issue needs clarification. Several models for a data system are under consideration; they rely on different technological processes and provide different information, which in turn supports different purposes:

  1. A traditional Statewide Longitudinal Data System (SLDS), or “P-20W” (preschool through workforce) data system, connects student records across education systems and to state workforce data. The data are updated after each school term, are made accessible to different audiences through reporting and analysis tools targeted to them, and are used for research to inform education policy and practice to achieve better and more equitable outcomes for all students.
  2. A frequent upload system allows educators in high schools and postsecondary institutions to access more up-to-date information, based on frequent (such as monthly or weekly) uploads of data to the system, which can be used to provide services to individual students as they progress from one institution to another across the educational continuum.
  3. A real-time data system involves connecting the student information systems at individual institutions through a technological interface, which enables those institutions to access “live” data in each other’s systems without needing to upload the data to any central repository.

While some people refer to a frequent upload model as a “real-time” data system, an actual real-time system can only be created by the education institutions and systems themselves, as it involves connecting their individual data systems rather than creating any separate, combined data system. Doing this poses significant cost and technological barriers related to the many different data platforms in use across California’s schools and colleges, particularly within the K-12 sector.

The current conversations around creating a statewide data system in California involve the first two models: a traditional SLDS and a frequent upload model. The two types of data systems are not mutually exclusive, and neither is a substitute for the other. Both play important roles. In considering a path forward for developing a statewide data system in the near term, policymakers need to understand the different benefits and limitations of both kinds of systems. A critical issue for the state is: which system should be the focus of this initial investment as the way to start making sense of all the information sitting within our public education systems?

Creating an SLDS in California would involve matching student records across the existing data systems at the California Department of Education and the systemwide offices of the California Community Colleges, California State University, and University of California, with benefits that include:

  • leveraging the state’s considerable investment in information that is already collected but now used primarily for compliance;
  • relying on technical processes that are well understood and well-tested;
  • requiring a fairly short timeline to develop, at reasonable cost, perhaps even within the Governor’s proposed $10 million investment;
  • containing information on all students enrolled in the public education systems, regardless of their educational pathway;
  • limiting data security concerns, as students’ names and other identifying information can be removed after matching records;
  • making information accessible to multiple education stakeholders, including students/families, educators, policymakers, and the public, increasing transparency and informing decisions;
  • enabling new data to be included as additional use cases arise in the future; and
  • offering the opportunity for expansion in the near term to include data from early learning, adult education, workforce training, private postsecondary institutions, health, social services, and corrections, increasing the issues of policy and practice that could be addressed.
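To make the matching step concrete, here is a minimal sketch of how records from two systems might be linked and then stripped of identifying information. It is an illustration only, not a description of how any California agency matches records: the field names (`name`, `birth_date`, `high_school`, `college`), the exact-match rule on name and birth date, and the salted hash used as a stand-in student key are all assumptions made for the example. Real record matching typically has to handle misspellings, name changes, and duplicate records.

```python
import hashlib


def link_key(name: str, birth_date: str, salt: bytes) -> str:
    """Derive a pseudonymous key from identifying fields.

    Hypothetical matching rule: exact match on normalized name plus birth date.
    """
    normalized = f"{name.strip().lower()}|{birth_date}"
    return hashlib.sha256(salt + normalized.encode("utf-8")).hexdigest()


def match_and_deidentify(k12_records, college_records, salt):
    """Link K-12 records to college records, then keep only non-identifying fields."""
    # Index the college records by their pseudonymous key.
    college_by_key = {
        link_key(r["name"], r["birth_date"], salt): r for r in college_records
    }

    linked = []
    for rec in k12_records:
        key = link_key(rec["name"], rec["birth_date"], salt)
        college = college_by_key.get(key)
        # Names and birth dates are used only to compute the key;
        # they are not copied into the linked output.
        linked.append({
            "student_key": key,
            "high_school": rec["high_school"],
            "enrolled_in_college": college is not None,
            "college": college["college"] if college else None,
        })
    return linked
```

The linked output carries every high school student, whether or not a college record was found, which mirrors the point above that such a system covers all students regardless of their pathway.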

The significant limitation of an SLDS model is that it does not provide frequent (weekly or monthly) uploads of information that can be used to support the transition of individual students across education systems.

A “frequent upload” system could help educators support students in the moment, with, for example, course changes aligned to students’ pathway interests. Very promising frequent upload models are already operating in California. For example, Sacramento City Unified School District is working to share such data from its high schools with regional higher education partners to monitor the college application and enrollment processes of its seniors as they approach graduation, so school officials can intervene when necessary to ensure students’ successful transition to college. The California College Guidance Initiative works with 54 school districts in a number of regions across the state to ensure that students’ high school transcripts, along with information about their college and career goals, are submitted electronically to colleges and financial aid providers. Such efforts provide important benefits to postsecondary institutions and students.

However, there are significant concerns when considering such data systems as the model for the initial development of a statewide data system. Frequent upload systems require near-constant data submissions by school districts, which strains their limited resources. Varying data platforms, data structures, and quality control processes pose technical barriers to compiling student records across individual institutions. There are greater data security concerns related to the maintenance and sharing of identifying information about students across the many institutions and individuals who would, by design, have access to such a data system. Overcoming the barriers to building out such a system statewide would involve high costs and an extensive timeframe, meaning that comprehensive statewide data would not be usable for years.

While continuing to support the development and use of frequent upload data systems, the state should move forward to connect the data systems we already have to create an SLDS, and use it to understand and then improve opportunities and outcomes for all California students. While one criticism of a traditional SLDS is that it would only benefit researchers, the uses of such systems in other states demonstrate their potential value for students and their families, for educators, and for the policymakers tasked with ensuring that the state’s investments are used most effectively to support all students.

For decades, California has been sitting on education data with no publicly available information on equitable opportunities and outcomes related to its investments in public education. The state is now well-positioned to join many other states in using existing student information to create an SLDS that can inform student decision-making, support schools and colleges in their efforts to improve student progress and outcomes, and meet the Governor’s vision of using data to improve and align state education and workforce policies to better achieve the educational goals of the state and its students. It’s time to seize that opportunity.