The 2008 College Affordability Act prohibited the U.S. Department of Education from establishing a student unit record (SUR) database. The database would have contained individual records for every student who attended a higher education institution eligible to receive federal funds (i.e., almost all of them).
The bill killed hopes of a national SUR for a while, but several organizations have successfully advocated for state databases containing individual student data in higher education. This has left a few people unhappy, and it appears the Obama administration is still attempting to resurrect a national SUR database. Opponents of a national system argue that the privacy concerns outweigh the benefit of a single centralized system. I don’t think a national SUR is needed either, though not because of privacy issues. My objections rest on several points that haven’t been mentioned elsewhere.
It seems many people advocate for a national SUR because they see it as the only way to reach a “true” answer. That is, they think the only way to measure educational outcomes is to measure them for every single student in the United States. This simply isn’t true. Introductory statistics shows that you only need a sample of a population to derive estimates for the entire population. And while larger samples let you estimate the population with a higher degree of confidence, even data collected from the entire population may be prone to error.
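The sampling point can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the population size, the 58% completion rate, and the function names are invented for illustration and do not come from any real data set.

```python
import math
import random
import statistics


def simulate_population(size, completion_rate, rng):
    """Build a hypothetical population: 1 = completed a degree, 0 = did not."""
    return [1 if rng.random() < completion_rate else 0 for _ in range(size)]


def estimate_rate(sample, z=1.96):
    """Return the sample proportion and an approximate 95% confidence interval."""
    n = len(sample)
    p = statistics.fmean(sample)
    standard_error = math.sqrt(p * (1 - p) / n)
    return p, p - z * standard_error, p + z * standard_error


rng = random.Random(42)  # fixed seed so the run is repeatable

# Assumed numbers: 200,000 students with a 58% completion rate.
population = simulate_population(200_000, 0.58, rng)
true_rate = statistics.fmean(population)

# Survey only 5,000 of them (2.5% of the population).
sample = rng.sample(population, 5_000)
estimate, low, high = estimate_rate(sample)

print(f"Rate in the full population: {true_rate:.3f}")
print(f"Estimate from the sample:    {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

With a sample of this size the interval is roughly plus or minus 1.4 percentage points wide, which is the kind of precision a well-drawn sample can offer without a record for every student.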
The U.S. Department of Education and the National Science Foundation have implemented several surveys that follow students throughout their education and even into the workforce. The National Longitudinal Survey of Youth (NLSY), National Education Longitudinal Study (NELS), Beginning Postsecondary Students (BPS), Baccalaureate and Beyond (B&B), and other data sets have used samples to follow a relatively small segment of students. Using proper statistical measures, these results can be extrapolated to the entire population. And while these students may lie to a surveyor, the same is possible with a national SUR, especially if the data is eventually tied to funding. Moreover, many of these longitudinal surveys ask to see documented records (e.g., transcripts) to ensure accurate data reporting.
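One common way to extrapolate from a sample is to weight each respondent by how many students he or she represents. The sketch below uses entirely invented strata and counts, and it is not the weighting scheme of any of the surveys named above. It only shows why weighting matters when one group is sampled more heavily than another.

```python
def unweighted_rate(strata):
    """Completion rate computed as if every respondent counted equally."""
    completed = sum(s["completed"] for s in strata)
    sampled = sum(s["sampled"] for s in strata)
    return completed / sampled


def weighted_rate(strata):
    """Completion rate where each respondent stands in for population / sampled students."""
    weighted_completed = 0.0
    weighted_total = 0.0
    for s in strata:
        weight = s["population"] / s["sampled"]  # students represented per respondent
        weighted_completed += weight * s["completed"]
        weighted_total += weight * s["sampled"]
    return weighted_completed / weighted_total


# Hypothetical numbers: community college students are oversampled.
strata = [
    {"name": "four-year", "population": 600_000, "sampled": 1_000, "completed": 600},
    {"name": "community college", "population": 400_000, "sampled": 2_000, "completed": 600},
]

print(f"Unweighted estimate: {unweighted_rate(strata):.2f}")
print(f"Weighted estimate:   {weighted_rate(strata):.2f}")
```

In this made-up case the unweighted figure understates the completion rate because the oversampled group completes less often. The weights correct for that.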
The crux of the problem in higher education research is not a lack of data, but methodological issues. Educational research is plagued by self-selection (also called selection bias). Prestigious programs tend to enroll highly qualified students, which makes the program look fantastic when all it did was select the ‘A’ students. Troubled programs and schools tend to attract troubled students from difficult socioeconomic backgrounds. Thus, poorly designed studies may incorrectly show positive impacts of an optional curriculum or program. These programs then tend to fail when implemented on a larger scale, once the troubled students who originally avoided the program enroll.
A national SUR database wouldn’t ameliorate self-selection issues. The only way to find out whether an educational program is truly effective is to use either experimental or quasi-experimental evaluations. For instance, a pilot preschool program in the 1960s called Perry Preschool had more applicants than seats. Consequently, a random set of students was chosen to enroll and the rest were not. By the time these kids turned 40, the Perry Preschool participants earned higher wages and were less likely to have been incarcerated. Such studies, which are definitive and convincing, could not be conducted through a national SUR.
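A short simulation shows why random assignment matters. Everything in it is an assumption made for illustration: the "ability" variable, the effect size of 0.5, and the enrollment rule are invented, and none of it is drawn from the Perry Preschool data.

```python
import random
import statistics


def estimated_effect(n, true_effect, self_selected, rng):
    """Compare average outcomes of enrolled and non-enrolled students.

    Returns the naive difference in means between the two groups.
    """
    enrolled, not_enrolled = [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)  # hypothetical prior preparation
        if self_selected:
            # Stronger students are more likely to opt in.
            enrolls = ability + rng.gauss(0, 1) > 0
        else:
            # A coin flip decides who gets a seat, as in a lottery.
            enrolls = rng.random() < 0.5
        outcome = 2.0 * ability + (true_effect if enrolls else 0.0) + rng.gauss(0, 1)
        (enrolled if enrolls else not_enrolled).append(outcome)
    return statistics.fmean(enrolled) - statistics.fmean(not_enrolled)


rng = random.Random(7)  # fixed seed so the run is repeatable
TRUE_EFFECT = 0.5       # assumed real benefit of the program

naive = estimated_effect(20_000, TRUE_EFFECT, self_selected=True, rng=rng)
randomized = estimated_effect(20_000, TRUE_EFFECT, self_selected=False, rng=rng)

print(f"True effect:                   {TRUE_EFFECT:.2f}")
print(f"Estimate with self-selection:  {naive:.2f}")
print(f"Estimate with random lottery:  {randomized:.2f}")
```

Under these assumptions the self-selected comparison overstates the program's benefit several times over, while the lottery comparison lands near the true value. Adding more records to the self-selected comparison would not fix it, because the bias comes from who enrolls, not from how many are counted.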
Researchers also need a robust set of variables. While a national SUR would contain a lot of variables, it wouldn’t be a robust set. Researchers already know that home factors influence a child’s education, such as parental income, parental education, and out-of-school activities. These variables are very difficult to incorporate into an administrative data set, but they can be captured through survey-based data sets.
A national SUR isn’t the solution to finding answers in higher education. The privacy concerns are considerable, yes, but no more disconcerting than those raised by the state longitudinal databases that exist now. Reinforcing proper methodology matters more than establishing a large central database.