Tuesday, March 18, 2014

Big Data versus the SAT


In a recent Time Magazine article, the president of Bard College, Leon Botstein, joined a chorus of criticism of the SAT, going so far as to call it “part hoax and part fraud.” Criticism is coming fast and furious because the College Board, which produces the SAT, has just unveiled a new, improved version of the test to try to fend off a trend among competitive colleges to downplay the role of the SAT, and even to eliminate its use entirely. (The College Board's not-for-profit status does not put it beyond reacting to a profit motive; not-for-profit does not mean you get no benefit from pulling in more revenue.)

Among the critiques are that performance during one afternoon in the junior year of high school should not govern a student’s future (the SAT is used not only for college admission, but also by employers further down the road); that the SAT has little to do with what really constitutes learning and productivity (for as long as I can remember, I haven’t done anything useful in my work by pulling the right answer off of a multiple-choice test); that it correlates with income more than it does with anything else; and that what has now become years of SAT preparation detracts from more productive learning.

And that is just the start of the list. By its very nature as a standardized test, to the extent colleges rely on the SAT, they are all looking at exactly the same criteria, so the same sort of students will percolate to the top of the stack. The creative and off-the-wall student who adds the richness and intellectual diversity that a college seeks might be blown off course, because there is no reason to think that such a student will do well in a timed, multiple-choice test, or even that such a student will have an interest in the many hours of preparation courses required to be competitive in the test. To take the effect of the SAT exam on creativity to the extreme, read a previous blog post I wrote on the effect standardized testing has had on creativity in Asian countries.

You would think that in the emerging world of big data, where Amazon has gone from recommending books to predicting what your next purchase will be, we should be able to find ways to predict how well a student will do in college, and more than that, predict the colleges where he will thrive and reach his potential.  Colleges have a rich database at their disposal: high school transcripts, socio-economic data such as household income and family educational background, recommendations and the extra-curricular activities of every applicant, and data on performance ex post for those who have attended. For many universities, this is a database that encompasses hundreds of thousands of students.

There are differences from one high school to the next, and the sample a college has from any one high school might be sparse, but high schools and school districts can augment the data with further detail, so that the database can extend beyond those who have applied. And the data available to the colleges can be expanded by orders of magnitude if students agree to share their admission data and their college performance on an anonymized basis. There already are common application forms used by many schools, so as far as admission data goes, this requires little more than adding an agreement in the college applications to share data, the sort of agreement we already make with Facebook or Google.


The end result, achievable in a few years, is a vast database of high school performance, drilling down to the specific high school, coupled with the colleges where each student applied, was accepted and attended, along with subsequent college performance. Of course, the nature of big data is that it is data, so students are still converted into numerical representations. But these will cover many dimensions, and those dimensions will better reflect what the students actually do. Each college can approach and analyze the data differently to focus on what it cares about. It is the end of the SAT version of standardization. Colleges can still follow up with interviews, campus tours, and reviews of musical performances, articles, videos of sports, and the like. But they will have a much better filter in place as they do so.

8 comments:

  1. I suspect that the bottom line is that many colleges are most interested in maximizing the amount of alumni donations, and all else follows from that. I'm curious whether their models for achieving that will be affected by big data.

  2. Dear Professor Bookstaber, would you agree to the same for the GMAT?

    Replies
    1. The more vocational the test, the less of an issue it is. For example, I have had to take various FINRA tests when I had broker-dealer responsibilities, and would not think of these SAT-related issues applying there. The GMAT is not as vocational as that, but it is more so than the SAT.

  3. "Colleges have a rich database at their disposal: high school transcripts, socio-economic data such as household income and family educational background, recommendations and the extra-curricular activities of every applicant, and data on performance ex post for those who have attended."

    Transcripts are paper copies or PDFs. They are in completely different formats from school to school, and often vary from year to year. It would be an unbelievably huge effort to turn transcripts into mineable data.

    Replies
    1. You could imagine a program that would do this relatively easily. In fact, such programs already exist.

    2. Doing this broadly on a legacy basis would be difficult. I wonder, though, whether there are schools that have moved beyond PDFs, and so whether this could be done on a school-specific basis. In any case, if an initiative were started now, and transcripts and other data were to enter the current era, this would be possible in a few years.

  4. I agree with Anonymous's point 100%. It would be a huge project to make the information consistent. One university by itself wouldn't do it. It would make more sense for a private firm to do it and sell the information to colleges.

  5. I'll say two things in the SAT's favor:

    1) Selective colleges have to make choices somehow. The SATs are probably better than the system they used before, which AFAIK was mostly based on the applicant's social status (directly, not implicitly).

    2) The SAT can be regarded not as an indicator of the student's suitability for college, but as an indicator of the college's suitability for the student. A flighty, creative student would probably not benefit from being subjected to boring lectures among a class of hundreds where achievement is measured by testing; others take well to the collegiate environment.
