Tuesday, March 18, 2014

Big Data versus the SAT

In a recent Time Magazine article, the president of Bard College, Leon Botstein, joined a chorus of criticism of the SAT, going so far as to call it “part hoax and part fraud.” Criticism is coming fast and furious because the College Board, which produces the SAT, has just unveiled a new, improved version of the test to try to fend off a trend among competitive colleges to downplay the role of the SAT, and even to eliminate its use entirely. (The College Board's not-for-profit status does not put it beyond the reach of a profit motive; not-for-profit does not mean there are no benefits from pulling in more revenue.)

Among the critiques are that performance during one afternoon in the junior year of high school should not govern a student’s future (the SAT is used not only for college admission, but also by employers further down the road); that the SAT has little to do with what really constitutes learning and productivity (I haven’t done anything useful in my work by picking the right answer on a multiple-choice test for as long as I can remember); that it correlates with income more than it does with anything else; and that what has now become years of SAT preparation detracts from more productive learning.

And that is just the start of the list. Because the SAT is by its very nature a standardized test, colleges that rely on it are all looking at exactly the same criteria, so the same sort of students will percolate to the top of every stack. The creative and off-the-wall student who adds the richness and intellectual diversity that a college seeks might be blown off course, because there is no reason to think that such a student will do well on a timed, multiple-choice test, or even that such a student will have an interest in the many hours of preparation courses required to be competitive on the test. To take the effect of the SAT exam on creativity to the extreme, read a previous blog post I wrote on the effect standardized testing has had on creativity in Asian countries.

You would think that in the emerging world of big data, where Amazon has gone from recommending books to predicting what your next purchase will be, we would be able to find ways to predict how well a student will do in college, and more than that, to predict the colleges where that student will thrive and reach his or her potential. Colleges have a rich database at their disposal: high school transcripts, socio-economic data such as household income and family educational background, the recommendations and extra-curricular activities of every applicant, and data on the subsequent college performance of those who attended. For many universities, this is a database that encompasses hundreds of thousands of students.

There are differences from one high school to the next, and the sample a college has from any one high school might be sparse, but high schools and school districts can augment the data with further detail, so that the database can extend beyond those who have applied. And the data available to the colleges can be expanded by orders of magnitude if students agree to share their admission data and their college performance on an anonymized basis. There already are common application forms used by many schools, so as far as admission data goes, this requires little more than adding an agreement to share data to the college applications, the sort of agreement we already make with Facebook or Google.

The end result, achievable in a few years, is a vast database of high school performance, drilling down to the specific high school, coupled with the colleges where each student applied, was accepted, and attended, along with subsequent college performance. Of course, the nature of big data is that it is data, so students are still converted into numerical representations. But these will cover many dimensions, and those dimensions will better reflect what the students actually do. Each college can approach and analyze the data differently to focus on what it cares about. It is the end of the SAT version of standardization. Colleges can still follow up with interviews, campus tours, and reviews of musical performances, articles, videos of sports, and the like. But they will have a much better filter in place as they do so.