The SAT and ACT have released their scores for the class of 2018, accompanied by the predictable wailing and gnashing of teeth about persistently low levels of STEM achievement.

As Nick Anderson of the Washington Post reports:

Forty-nine percent of students in this year’s graduating class who took the SAT received a math score indicating they had a strong chance [75%] of earning at least a C in a college-level math class, according to data made public Thursday. That was significantly lower than on the reading and writing portion of the tests: 70 percent of SAT-takers reached a similar benchmark in that area.

What the article quite remarkably fails to mention is that the benchmark verbal score, 480, is a full 50 points lower than that for math. Given the discrepancy, it is entirely unsurprising that fewer students met the benchmark in math.

Let’s try some basic — and I do mean basic — critical thinking with statistics, shall we?

To understand what a 480 verbal score on the redesigned SAT actually means, consider that it translates into about 430 on the pre-2016 exam, which in turn translates into about a 350 (!) on the pre-1995 SAT.

This is not “college ready” in any meaningful sense of the term. In my experience, students scoring in this range typically struggle to do things such as identify when a statement is a sentence, or grasp the concept that texts are making arguments as opposed to “just saying stuff.” But to reiterate one of my favorite points, this is in part why the SAT was changed: the decline in reading/writing scores was becoming embarrassing. And if you can’t change the students, the only other option is to change the test, and the scoring system along with it.

But that alone doesn’t answer the question of why there should be such a large gap between the math and verbal cutoffs. I think there are multiple issues at play here.

The first and most charitable explanation is that it is simply easier to pass an introductory college-level English class than it is to pass a college-level math class. Grading in STEM classes is typically more rigorous than in humanities classes, and so it is not surprising that the corresponding SAT benchmark would be lower as well. This is borne out by the ACT’s benchmarks, although whether the grading difference really translates into a benchmark 50 points lower as opposed to, say, 10 or 20, is up for debate. But more about that in a bit.

If you’ll bear with me, I think it’s instructive to first take a look at the ACT’s description of its own benchmarking process.

The ACT College Readiness Benchmarks are empirically derived based on the actual performance of college students. ACT has compiled an extensive database of course grade data from a large number of first-year students across a wide range of postsecondary institutions. The data were provided through ACT’s research services and other postsecondary research partnerships.

The Benchmarks for English, mathematics, reading, and science were first established in 2005 and were updated in 2013 using data from more recent high school graduates (Allen and Sconing 2005; Allen 2013). The STEM (Mattern, Radunzel, and Westrick 2015; Radunzel, Mattern, Crouse, and Westrick 2015) and ELA (Radunzel et al. 2017) Benchmarks were established more recently. The data were weighted to be representative of ACT-tested high school students and two- and four-year colleges nationwide. (https://www.act.org/content/dam/act/unsecured/documents/pdfs/R1670-college-readiness-benchmarks-2017-11.pdf)

The key phrase here is empirically derived based on the actual performance of college students. In contrast, the SAT benchmarks were established when the test was rolled out — that is, before there was an established sample whose actual college performance could be studied over time. As a result, they cannot be considered “evidence-based” in any normally understood sense of the phrase.

Now, a reasonable assumption would be that the College Board derived its benchmarks at least in part from the ACT’s, which also show a large gap between the verbal benchmarks (English – 18, Reading – 22) and the math ones (Math – 22, Science – 23).*

Upon inspection, however, the correspondence breaks down — at least on the verbal side. According to the concordance table provided by the College Board, a 480 verbal translates into a 34 combined ACT English/Reading, or approximately 17 per section. The ACT Reading benchmark, however, is set a full five points higher, at a 22. If the College Board were actually basing its figures on the ACT, then the verbal benchmark should have been around a 530 (the SAT equivalent of 18 + 22 = 40) — exactly what it is for Math. Instead, it is the equivalent of a point below the lower of the two ACT verbal scores. In contrast, the ACT Math benchmark of 22 translates into a score only 10 points higher than the SAT’s own benchmark (540 vs. 530). That is a remarkable difference.
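To make the arithmetic concrete, here is a small Python sketch of the comparison. The score conversions are hard-coded from the figures cited in this post (a combined ACT English/Reading of 40 converting to roughly an SAT 530, and an ACT Math 22 to roughly an SAT 540); they are illustrative values, not a full concordance table.

```python
# ACT College Readiness Benchmarks, as cited above.
ACT_BENCHMARKS = {"English": 18, "Reading": 22, "Math": 22, "Science": 23}

# Approximate SAT equivalents from the concordance figures cited in this post.
SAT_EQUIV_OF_COMBINED_ENG_READ = {34: 480, 40: 530}  # combined E+R -> SAT verbal
SAT_EQUIV_OF_ACT_MATH = {22: 540}                    # ACT Math -> SAT math

# What the SAT verbal benchmark "should" be if derived from the ACT's:
combined = ACT_BENCHMARKS["English"] + ACT_BENCHMARKS["Reading"]  # 40
act_implied_verbal = SAT_EQUIV_OF_COMBINED_ENG_READ[combined]     # ~530

# The actual SAT benchmarks.
actual_verbal, actual_math = 480, 530

print(act_implied_verbal - actual_verbal)                         # 50-point shortfall
print(SAT_EQUIV_OF_ACT_MATH[ACT_BENCHMARKS["Math"]] - actual_math)  # 10-point gap
```

The point of the exercise: on the math side the SAT benchmark lands within 10 points of the ACT-derived figure, while on the verbal side it sits a full 50 points below it.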

It’s also striking that the benchmarks for the pre-2016 SAT were set at 500 for each section, which converts to about a 560 on the redesigned exam — fully 80 points above the new verbal benchmark. So in fact, verbal scores were doubly adjusted downward: once in absolute terms (500 to 480), and then again in relative ones (480 rSAT = 430 SAT). The math, however, corresponds exactly: a 500 on the old exam translates into a 530 on the new.

Again, why lower the verbal benchmark this much?

In answering this question, I think it’s helpful to consider the purpose of the SAT redesign, namely the intention to compete in, and displace the ACT from, the enormously lucrative state testing markets. Setting the benchmark below the ACT’s was a way of ensuring that more students would be labeled “college ready” and thus of inducing states to drop the ACT in favor of the SAT. If the goal was not to produce students who are actually educated but rather to herd as many 18-year-olds as possible into college regardless of their preparation, that was an entirely logical decision. Doing so would also give the impression that students’ English skills are really fine and that STEM is where the real problem lies — a narrative that journalists like the Post’s Nick Anderson have shown themselves perfectly willing to perpetuate.

That leads into the second reason, namely that the College Board has a stake in whipping up hysteria around the idea that American students are falling behind in STEM fields. More failure = more reforms = new and improved testing = more money for testing and software companies.

The subtext here is also that it is a Very Big Problem if students aren’t performing well in math because achievement in STEM = A Good Job, whereas achievement in English isn’t anything to get excited about. In fact, by setting the bar for reading and writing so much lower than the one for math, the College Board is suggesting that students can be “college ready” with only very basic skills (an implication that goes hand in hand with the dismissal of moderately challenging college-level vocabulary as “obscure”).

Another related but less obvious reason involves the potential impact on curriculum in states where the SAT is mandated for graduation. Although grammar is technically part of the Common Core Standards, it is not, at least to the best of my knowledge, explicitly tested on state tests. And if it’s not on the state tests, chances are it isn’t being covered in classrooms. (Incidentally, CC puts most of the grammar tested on the SAT in the earlier grades. As far as I can tell, the people who threw the standards together pretty much went down the list of concepts tested on rSAT and semi-arbitrarily distributed them across the elementary grades.) Setting a higher benchmark for English would likely require high school teachers to cover more grammar, which would in turn very probably require funding for professional development — funding that could not be used for STEM subjects. Realistically, given a restricted budget, which one are schools going to choose?

Then there’s the reading problem, which is among the thorniest in education. There are so many component pieces, and so many ways in which misunderstandings can occur, that there is no easy way to make students who are already behind into good readers. It is so much easier to just lower the bar and declare that things aren’t so bad after all.

*The ACT might also be accused of manipulating its statistics to show a lower baseline of achievement for success in college-level English. Unlike the figures for Reading, Math, and Science, which are based on significantly higher proportions of 4-year college students than 2-year college students, those for English are based on equal numbers of students at 2- and 4-year colleges. In the case of STEM, for which the benchmark score is a 26, only students at 4-year schools (where earning passing grades is presumably more challenging) were considered.