A list of College Board failures

Thanks to a blog reader for submitting this comment.  

How many abysmal failures has the College Board committed or experienced in the past 18 months?

The test (June 2015) with a misprint in the last section about how much time students had, which led to uneven test administrations across the country.

Disruptions in sending October (or was it November?) scores to colleges because the College Board started using its new system for score distribution – for the old test scores – before most admissions offices had shifted over to that system (because this was still the old test format).

Multiple breaches of security in different parts of Asia during multiple administrations (don’t we now expect to read about those breaches after every test administration?)

Making a testing site re-administer the test because CB lost / couldn’t find the student answer sheets. (This item barely made the news: http://www.kvue.com/news/local/williamson-county/sat-answer-sheets-lost-students-retake-saturday/274699783 )

Yet several states are replacing the state high school exams (PARCC or other) with the College Board’s SAT.  Here is a report from Connecticut about the results of using a college admissions test as the state-wide test for high school: http://www.educationworld.com/a_news/results-connecticut%E2%80%99s-first-statewide-sat-exam-reveal-sobering-achievement-gap-1119406746

Maybe it is because I wasn’t paying as close attention back then, but it seems as if these sorts of debacles are occurring with alarming regularity in the Coleman ‘Redesign Era’ of the SAT.

Anyone care to add to the list? 

How the new SAT could affect the tutoring industry

In my last post, I took the College Board to task for its boast that its partnership with Khan Academy has led to a 19% decrease in the use of paid prep, presumably defined as classes or tutoring, although the College Board fails to specify. Aside from the questionable basis for that statistic (exactly how was it obtained? what were the characteristics of the groups surveyed? how were the demographic changes incurred by the adoption of the SAT as a state test taken into account?), I do think it’s worth exploring the question of just how the new SAT might affect the tutoring industry.

For what it’s worth, I’ve heard from a number of tutors that their business is actually up this year, although those tutors tend to work with students for whom free, online prep is borderline irrelevant anyway.

I’m also aware that most experienced tutors are pushing their students toward the ACT for the foreseeable future. If there was indeed a drop in paid SAT preparation, it was almost certainly due at least in part to students paying for ACT preparation instead. 

What interests me here, however, is the assumption that students will be the ones driving the changes.

But what if it goes the other way as well? What if it turns out that tutors don’t want to prepare students for the new SAT?

When I chose not to tutor the new test, I wondered whether I was overreacting. I even felt a bit petty. But then I talked to an SAT veteran who has run a popular tutoring center for several decades. She informed me that when the test changed, she would be closing the center and retiring from test prep. Another colleague, an award-winning teacher and author with an extraordinary knowledge of the old SAT, informed me that he had decided to step back from the new test as well and switch, grudgingly, to the ACT. Then there was another colleague, who had meticulously charted the many problems plaguing the College Board and concluded that she could not be involved with the new exam; she switched to the ACT as well. And yet another colleague, this one far from retirement age, shuttered her SAT prep business and stopped tutoring entirely. This is not just a local phenomenon either: these colleagues are located in four states in three very different regions of the country. 

A different set of colleagues I’ve talked to are nominally tutoring the new exam, but strongly guiding their students toward the ACT whenever possible. They tutor rSAT as necessary, but with misgivings — sometimes very deep misgivings. (One tutor confessed to me that she had actually started to cry when she looked at it for the first time.)

To be clear, I don’t begrudge anyone for tutoring the new test. Businesses have to accede to the needs of their clientele, and changes or not, the SAT will undoubtedly remain entrenched in areas where it has traditionally dominated. It’s certainly not reasonable to expect people to turn away business just because they think a test is poorly written. And I have no doubt that some of them are outstanding teachers who will do their utmost to ensure students walk away having learned something of value.

I do, however, wonder whether the new test will produce a longer-term shift in the type of people attracted to SAT tutoring — and whether that shift will mirror the shift already occurring in education as a whole.

Last winter, when I was interviewed as part of a group of tutors for Michael Arlen Davis’s documentary The Test, Michael mentioned that one of his unexpected findings during the course of making the film was that SAT tutoring was such an interesting niche profession, one that attracted people he genuinely enjoyed talking to and bouncing ideas off of. Certainly, the group of us that he interviewed was an exceptionally loquacious one, with some very outsize personalities. We also made it pretty clear to him that we did not suffer faulty reasoning gladly!

I think it’s fair to assume that SAT tutoring has traditionally attracted so many people in that category because the test itself was interesting to tutor. Even if a lot of it felt routine after a while, there was always a particularly deviously constructed question that forced you not only to think, but to step back and admire the sheer ingenuity of the test. (Admittedly, this was only possible if you had continuing access to released QAS exams.) There were just enough curveballs to keep people on their toes, and it was always rewarding to move kids from reading and thinking at a high school level to something much closer to an adult level. Plenty of long-time tutors stumbled into SAT prep by accident, then got sucked in. That was certainly the case for me.

With that kind of cleverness all but absent from the new exam, a huge component of what made SAT tutoring appealing has effectively been eliminated.

Students may find the test stultifying, but many of them have no choice but to take it. Tutors, on the other hand, have no such obligation. And I find it telling that given the option to walk away, some of them are choosing to do so.

I have no idea whether this is a relatively isolated situation that applies only to a limited number of people who happen to have the luxury of deciding what they want to teach, or whether it’s indicative of a larger trend. But I know that I now hear other private tutors voicing the same kinds of complaints about Common Core drivel that I only used to hear from people involved in the public school system. 

I seriously wonder whether as time goes on, fewer of the over-educated, quirky types who often make such outstanding teachers will continue to fall into SAT tutoring (at least on the verbal side). If they do somehow end up in test prep, I suspect that they will be more likely to tutor exams that have some level of adult interest. As a result, SAT students may ultimately be less likely to be taught by tutors who actually know something about English, and more likely to end up working with people who can only parrot the College Board’s empty jargon (“relevant words!” “evidence-based reading!”), if not outright abuses of language. 

One could certainly make the argument that reducing the market for expensive tutoring is an effective means to level the playing field, but looked at the other way around, the situation raises real questions about the quality of the test. After all, if people who could be earning at minimum $100/hour are jumping ship for ethical reasons rather than tutor something as theoretically innocuous as a standardized test, something might not be quite right. 

College Board logic

The following passage is excerpted from a recent College Board press release: http://www.prnewswire.com/news-releases/one-year-since-launch-official-sat-practice-on-khan-academy-is-leveling-the-playing-field-for-students-300278934.html.

A year ago today, Official SAT® Practice for the new SAT went live on KhanAcademy.org, making free, world-class, personalized online practice available for all students. There are now more than 1.4 million unique users on Official SAT Practice on Khan Academy — this represents four times the total population of students who use all commercial test prep classes in a year combined. Data show that the practice platform is reaching students across race, ethnicities, and income levels — mirroring the percentage of SAT takers. Almost half of all SAT takers on March 5 used Official SAT Practice to prepare, causing a 19 percent drop in the number of students who paid for SAT prep resources.


Which of the following would most directly undermine the College Board’s assertion that the number of students using Official SAT Practice was responsible for the 19 percent decline in the number of students paying for SAT prep resources? 

(A) Providers of SAT prep resources have begun offering low-cost preparation programs in order to more effectively compete with Khan Academy.
(B) Students who sign up for free Official SAT Practice on Khan Academy typically use that platform as their only resource for SAT preparation.
(C) The number of students who took the SAT this year was larger than the number of students who took the SAT in recent years. 
(D) Students who in the past would have taken the SAT and paid for SAT preparation resources instead took the ACT and paid for ACT preparation resources. 
(E) More students registered for free Official SAT Practice on Khan Academy than registered for the paid prep previously offered by the College Board. 

(Scroll down for the answer)











Answer: D

Virtually every experienced tutor I’ve heard from has made a concerted effort to steer students toward the ACT this year. Furthermore, families who pay for test prep tend to be savvy enough to want their children to prep for an exam that is a known entity, one for which a substantial body of authentic practice material exists, rather than be offered up as guinea pigs.

The decision to use the SAT as a state graduation test also provides a possible explanation for the alleged drop in paid test prep. I’ve heard from other sources that non-required SAT registration was actually down by about 20% this year. It therefore stands to reason that a significant percentage of the students taking rSAT were students who would not have taken the test, had they not been required to do so by their schools. This is a major shift in the test-taking population, and it makes comparisons between the pre-March 2016 group and the post-March 2016 group very difficult. The students taking the test only because of a school requirement would almost certainly not have paid for test preparation in the first place.

If the College Board’s statistics are correct (and based on recent revelations, there’s considerable reason to question whether that is in fact the case), it seems likely that a combination of factors produced the drop. There are undoubtedly many students who are using Khan Academy exclusively, but 1) many of those students would not have paid for test prep in previous years anyway; and 2) some of those students will not meet their goals through Khan prep alone and will sign up for a class or decide to work with a tutor. Summer is when both of those things are most likely to happen, also calling spring statistics regarding paid prep into question. Given the extent to which Khan was touted, it also seems reasonable to assume that more students deliberately waited for their scores before deciding whether to opt for paid prep. 

Never mind the fact that registering as a “unique user” on Khan in no way indicates that a student will use the site for the type of consistent, sustained study required for improvement, or even that they’ll ever bother to log in again. Some students certainly will use its offerings to maximum advantage, but again, those are the super-focused self-starters who would have worked on their own regardless. And they are a very small minority. (Note to tech geniuses: the fact that you had the drive to sit yourself down at the age of 16 and teach yourself calculus for fun does not mean that the average 16-year-old non-tech genius can do the same.)

ETS may have its problems, but at least its employees tend to understand the very basics of logic, like, oh, say, the difference between correlation and causation. 

It’s statements like these that really force one to question the College Board’s ability to produce a reasoning test. 

Former College Board executive blows the whistle about the new SAT

Manuel Alfaro, a former executive director at the College Board, has written a series of posts on LinkedIn detailing the myriad problems plaguing the development of the new exam. 

According to Alfaro, not only were many of the items developed for the first administration of the test extraordinarily problematic (see below), but many of the items that appeared on the test were not actually reviewed by the Content Advisory Committee until after the test forms had been constructed.

Committee members repeatedly attempted to call David Coleman’s attention to the problem, but were ignored. 

Alfaro was also responsible for rewriting and rubber-stamping the redesigned test specifications in order to hide the fact that they were taken directly from the Common Core middle- and high school standards. (Because, of course, the College Board expected everyone to somehow forget that David Coleman, the head of the College Board, was also responsible for Common Core.) 


The College Board completed the backstory for the test specifications by citing reports/analyses performed by independent groups as evidence of the alignment between the redesigned SAT’s research-based, empirical backbone and the Common Core.

Those “reports/analyses” were then used to persuade states such as Colorado and Michigan to drop the ACT in favor of the SAT, giving the College Board thousands of students in additional market share. 

And furthermore: 

Of the many concerns raised by the Content Advisory Committee, here are the top three:

Item Quality: Committee members were very concerned with the quality of the items the College Board brought to committee meetings for review. Their biggest concern was the large number of items that were mathematically flawed; items that did not have correct answers; and items that did not have accurate or realistic contexts. Some members even went as far as stating that they had never seen so many seriously flawed items.

Development Schedule: Committee members felt that schedules did not allow them enough time to perform thorough reviews. Given the large number of items they had to review (and the poor quality of the items), they needed more time to provide meaningful comments and input.

Development Process: Committee members felt that the process used to develop the items was inadequate. They felt that the process lacked the rigor required to produce the high quality items necessary for item data to be useful. (https://www.linkedin.com/pulse/shining-spotlight-dark-corners-college-board-concerns-manuel-alfaro?trk=mp-reader-card)

Alfaro also indicates that an abnormally high number of items were revised, often to the point of being completely rewritten, after being pre-tested. As a result, some questions that appeared on the actual test had effectively never been vetted. 

At least this explains why the College Board wouldn’t let tutors into the first administration of the new exam. The only reason to surround a test with that type of secrecy is to try to hide how poorly written the test is. If the College Board had so few items benchmarked for validity, it would also explain why the March test was reused in June. 

Alfaro has also started a petition asking the White House to investigate the College Board’s misdoings, but even if he does succeed in getting enough signatures, I suspect that might be akin to asking the fox to check up on the henhouse. Coleman and the Common Core crew have deep ties to the Obama administration, via Arne Duncan. These are problems that go all the way to the top. 

Is the College Board reusing new SATs already?

According to the chatter on College Confidential, some students are reporting that they received June SATs identical to their March tests.

At this point, it’s also common knowledge that Asian test-prep companies have been distributing the March test. Inevitably, then, some lucky students will have prepped for the June exam using…the June exam. (As if barring adults from non-released exams were ever going to prevent this sort of occurrence.) 

This comes just as the College Board and Khan Academy announce that they have successfully leveled the playing field among test-takers. 

The College Board has been in the habit of recycling tests for quite a while, but it would stand to reason that three months isn’t quite long enough to wait. 

Somehow I don’t think this is what the College Board meant by “transparency.” 

Is the new SAT really the PARCC in disguise?

In the spring of 2015, when the College Board was field testing questions for rSAT, a student made an offhand remark to me that didn’t seem like much at the time but that stuck in my mind. She was a new student who had already taken the SAT twice, and somehow the topic of the Experimental section came up. She’d gotten a Reading section, rSAT-style. 

“Omigod,” she said. “It was, like, the hardest thing ever. They had all these questions that asked you for evidence. It was just like the state test. It was horrible.” 

My student lived in New Jersey, so the state test she was referring to was the PARCC. 

Even then, I had a pretty good inkling of where the College Board was going with the new test, but the significance of her comment didn’t really hit me until a couple of months ago, when states suddenly started switching from the ACT to the SAT. I was poking around the internet, trying to find out more about Colorado’s abrupt and surprising decision to drop the ACT after 15 years, and I came across a couple of sources reporting that not only would rSAT replace the ACT, but it would replace the PARCC as well.

That threw me a little bit for a loop. I knew that PARCC was hugely unpopular and that a number of states had backed out of the consortium, but still… something smelled a little funny about the whole thing. Why would states allow PARCC to be replaced by rSAT? They were two completely different tests…right?

I mulled it over for a while, and then something occurred to me: Given that any exam that Colorado administered would have to be aligned with Common Core (or whatever it is that Colorado’s standards are called now), it seemed reasonable to assume that the switch from PARCC to rSAT could only have been approved if the two tests weren’t really that different.

At that point, it made sense to actually look at the PARCC. Like most people in the college admissions-test world, I had never really had a reason to look at the PARCC before; state tests were uncharted territory for me.

Luckily, PARCC had recently released a broad selection of 2015 items on its website — more than enough to provide a good sense of what the test is about. After a modicum of fruitless hunting around (not the easiest website to navigate!), I managed to locate the eleventh grade sample ELA questions. When I started looking through them, the overlap with rSAT was impossible to miss. Despite some superficial differences in passage length and question wording, the two tests were definitely cousins. Close cousins. I asked a couple of other tutors about the Math portion, and they more or less concurred: not identical, but similar enough.

On one hand, that wasn’t at all surprising. After all, both are products of Common Core, their development overseen by Coleman and Co. As such, it’s only natural that they embody the hallmarks of Coleman’s, shall we say, heavy-handed, amateurish, and idiosyncratic approach to analysis of the written word.

On the other hand, it was quite striking. The PARCC, unquestionably, was developed as a high school exit test; the SAT, as a college entrance test. Why should the latter suddenly bear a strong resemblance to the former? 

Just as interesting as what the tests contained was what they lacked — or at least what they appeared to lack, based on the sample questions posted on the PARCC website. (And goodness knows, I wouldn’t want to pull a Celia Oyler and incur the wrath of the testing gods.)

Consider, for example, that both rSAT and PARCC:

  • Consist of two types of passages: one “literary analysis” passage and several “informational texts” covering science and social-science topics but, apparently, no humanities (art, music, theater).
  • Include one passage or paired passage from a U.S. historical document.
  • Focus on a very limited number of question types: literal comprehension, vocabulary-in-context, and structure. Remarkably, no actual ELA content knowledge (e.g. rhetorical devices, genres, styles) is tested. 
  • Rely heavily on two-part “evidence” questions, e.g. “Select the answer from the passage that supports the answer to Part A” vs. “Which of the following provides the best evidence for the answer to the previous question?” The use of these questions, as well as the questionable definition of “evidence” they entail, is probably the most striking similarity between rSAT and PARCC. It is also, I would argue, the hallmark of a “Common Core test.” Considering the number of Standards, the obsessive focus on this one particular skill is quite striking. But more about that in a little bit. 
  • Test simple, straightforward skills in bizarrely and unnecessarily convoluted ways in order to compensate for the absence of substance and give the impression of “rigor,” e.g. “which detail in the passage serves the same function as the answer to Part A?”
  • Include texts that are relatively dense and include some advanced vocabulary, but that are fairly straightforward in terms of structure, tone, and point-of-view: “claim, evidence; claim, evidence,” etc. There is limited use of “they say/I say,” or the type of sophisticated rhetorical maneuvers (irony, dry humor, wordplay) that tend to appear in actual college-level writing. That absence is a notable departure from the old version of the SAT and, contrary to claims that these exams test “college readiness,” is strikingly misaligned with college work.

Here I’d like to come back to the use of two-part “evidence” questions. Why focus so intensely on that one question type when there are so many different aspects of reading that make up comprehension?

I think there are a few major reasons.

First, there’s the branding issue. In order to market Common Core effectively, the Standards needed to be boiled down into an easily digestible set of edu-buzzwords, one of the most prominent of which was EVIDENCE. (Listening to proponents of CCSS, you could be forgiven for thinking that not a single teacher in the United States — indeed, no one anywhere — had ever taught students to use evidence to support their arguments prior to 2011.) As a result, it was necessary to craft a test that showed its backers/funders that it was testing whether students could use EVIDENCE. Whether it was actually doing such a thing was beside the point. 

As I’ve written about before, it is flat-out impossible to truly test this skill in a multiple-choice format. When students write papers in college, they will be asked to formulate their own arguments and to support them with various pieces of information. While their professors may provide a reading list, students will also be expected to actively seek sources out in libraries, on the Internet, etc., and they themselves will be responsible for judging whether a particular source is valid and for connecting it logically and convincingly to their own, original argument. This skill has only a tangential relationship to even AP-style synthesis essays and almost zero relationship to the ability to recognize whether a particular line from paragraph x in a passage is consistent with main idea y. It is also very much contingent upon the student’s understanding of the field and topic at hand.

So what both the PARCC and rSAT are testing is really not whether students can use evidence the way they’ll be asked to use it in college/the real world, but rather whether they can recognize when two pieces of information are consistent with one another, or whether two differently worded statements express the same idea. (Incidentally, answers to many PARCC “evidence” question pairs can actually be determined from the questions alone.) 

The problem is that using evidence the way it’s used in the real world involves facts, but facts = rote learning, something everyone agrees should be avoided at all costs. 

Besides, requiring students to learn a particular set of facts would be so politically contentious as to be a non-starter (what facts? whose facts? who gets included/excluded? why isn’t xyz group represented…? And so on and so forth, endlessly.)

When you only allow students to refer back to the text and never allow them to make arguments that involve anything beyond describing the words on the page in fanciful ways, you sidestep that persnickety little roadblock.

Nor, incidentally, do you have to hire graders who know enough about a particular set of facts to make reliable judgments about students’ discussions of them. That, of course, would be unmanageable from both a logistical and an economic standpoint. Pretending that skills can be developed in the absence of knowledge (or cheerily acknowledging that knowledge is necessary but then refusing to state what knowledge) is the only way to create a test that can be scaled nationally, cheaply, and quickly. Questions whose answers merely quote from the text are also extraordinarily easy to write and fast to produce. If those questions make up half the test, production time gets a whole lot shorter. 

The result, however, is that you never actually get to deal with any ideas that way. You are reduced to stating and re-stating what a text says, in increasingly mind-bending ways, without ever actually arriving at more than a glancing consideration of its significance. If high school classes become dedicated to this type of work, that’s a serious problem: getting to knock around with ideas that are a little bit above you is a big part of getting ready to go to college. 

You can’t even do a good old-fashioned rhetorical analysis because you don’t know enough rhetoric to do that type of analysis, and acquiring all that rhetorical terminology would involve “rote learning” and thus be strictly verboten anyway.

The result is a stultifying mish-mash of formal skills that tries to mimic something kinda high level, but that ends up being a big bucket of nonsense. 

There is also, I think, a profound mistrust of students baked into these tests. A friend of mine who teaches high school tells me that the administrators at her school have, for several years now, been dogging the teachers with the question “how do you know that they know?” Translation: what data have you collected to prove to the powers that be that your students are appropriately progressing toward college and career readiness? In addition to vaguely mimicking a high-level skill, forcing students to compulsively justify their answers in multiple-choice format gives those powers that be quite a lot of data.

There also seems to be a latent fear that students might be trying to pull one over on their teachers, or on the administration — pretending to understand things when they’re actually just guessing. That, I suspect, is a side-effect of too many multiple-choice tests: when students actually write things out, it’s usually a lot clearer what they do and don’t understand. But of course it’s a lot harder to reduce essays to data points. They’re far too messy and subjective.

So the result is to try to pin students down, force them to read in ways that no one would possibly read in real life (oh, the irony!), and repeatedly “prove” that they understand that the text means what it means because it says what it says… There is something almost pathetic about the grasp for certainty. And there’s something a good deal more pathetic about teachers who actually buy into the idea that this type of low-level comprehension exercise is some sort of advanced critical thinking skill that will magically make students “college ready.” 

But to return to my original point: one of the most worrisome aspects of the whole discussion about the SAT and the PARCC in regards to the state testing market is that the former is presented as a genuine alternative to the latter. Yes, the SAT is a shorter test; yes, it’s produced by the College Board rather than Pearson (although who knows how much difference there is at this point); yes, it’s paper-based. But it’s really just a different version of the same thing. The College Board is banking on the fact that the SAT name will deter people from asking too many questions, or from noticing that it’s just another shoddy Common Core test. And so far, it seems to be working pretty well.