"The Big Problem with the New SAT"

05/06/2015

From the NYT, an op-ed that got ripped by commenters:

The Big Problem With the New SAT

By RICHARD C. ATKINSON and SAUL GEISER MAY 4, 2015

Richard C. Atkinson is president emeritus of the University of California. Saul Geiser is a research associate at the Center for Studies in Higher Education at the University of California, Berkeley.

Atkinson was the guy behind the 2005 introduction of the ill-fated Writing portion of the SAT and the dropping of the analogies questions. As the biggest customer of the College Board, he swung a lot of weight by threatening to have the U. of California schools develop their own entrance exam. He considered himself something of a testing expert: a little learning is a dangerous thing.

AT first glance, the College Board’s revised SAT seems a radical departure from the test’s original focus on students’ general ability or aptitude. Set to debut a year from now, in the spring of 2016, the exam will require students to demonstrate in-depth knowledge of subjects they study in school.

Judging by the newly released PSAT, this isn’t very true.

The verbal test, for instance, looks like it will reward kids who read Jane Austen, Slate, and Wired in their spare time and it will likely beat up kids with IQs below about 110.

Really, why wouldn’t we want to have one part of the admission process reward high school students who read outside of school? Aren’t they more likely to benefit from and contribute to higher education?

The revised SAT takes some important, if partial, steps toward becoming a test of curriculum mastery. In place of the infamously tricky, puzzle-type items, the exam will be a more straightforward test of material that students encounter in the classroom. …

And the biggest problem is this: While the content will be new, the underlying design will not change. The SAT will remain a “norm-referenced” exam, designed primarily to rank students rather than measure what they actually know.

Such exams compare students to other test takers, rather than measure their performance against a fixed standard. They are designed to produce a “bell curve” distribution among examinees, with most scoring in the middle and with sharply descending numbers at the top and bottom. “Criterion-referenced” tests, on the other hand, measure how much students know about a given subject. Performance is not assessed in relation to how others perform but in relation to fixed academic standards.

In other words, set a low standard so lots of kids can be A-students.

Assuming they have mastered the material, it is possible for a large proportion, even a majority, of examinees to score well; this is not possible on a norm-referenced test.

We already have one area of grade inflation. Why do we need another?
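To make the distinction concrete, here is a minimal sketch in Python. The scores, the cohorts, and the cutoffs are all made up for illustration; nothing here comes from the College Board. It reports the same raw scores two ways: as a percentile rank (norm-referenced) and as the share of kids clearing a fixed cutoff (criterion-referenced).

```python
from bisect import bisect_left


def percentile_ranks(raw_scores):
    """Norm-referenced: percent of test takers scoring strictly below each score."""
    ordered = sorted(raw_scores)
    n = len(ordered)
    return [100.0 * bisect_left(ordered, s) / n for s in raw_scores]


def share_meeting_standard(raw_scores, cutoff):
    """Criterion-referenced: fraction of test takers at or above a fixed cutoff."""
    return sum(s >= cutoff for s in raw_scores) / len(raw_scores)


# Hypothetical raw scores (percent correct). Cohort B is cohort A with
# every student doing 10 points better.
cohort_a = [40, 50, 60, 70, 80]
cohort_b = [s + 10 for s in cohort_a]

# Percentile ranks are identical for both cohorts, because each student
# is only compared with the others in the same cohort.
ranks_a = percentile_ranks(cohort_a)
ranks_b = percentile_ranks(cohort_b)

# Against a fixed cutoff of 60, the pass rate rises from cohort A to B.
pass_a = share_meeting_standard(cohort_a, cutoff=60)
pass_b = share_meeting_standard(cohort_b, cutoff=60)

# Drop the cutoff to 50 and everybody in cohort B passes.
pass_b_low_bar = share_meeting_standard(cohort_b, cutoff=50)

print(ranks_a, ranks_b)
print(pass_a, pass_b, pass_b_low_bar)
```

The sketch cuts both ways. The percentile ranks hide the across-the-board improvement, which is the op-ed's complaint. But the pass rate is whatever the chosen cutoff makes it, and the lower the cutoff, the less the test says about who is better than whom.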

K-12 schools increasingly employ criterion-referenced tests for this reason. That approach reflects the movement during the past two decades in all of the states — those that have adopted their own standards, as well as those that have adopted the Common Core — to set explicit learning standards and assess achievement against them.

You know, uh, Harvard doesn’t actually want its 30,000 applicants to be an indistinguishable mass of kids who surpassed some minimum threshold. Harvard was the chief force behind the development of the modern testing system in the first 60 years of the 20th century. And I gotta bet that looking back from today, Harvard, with its $36 billion endowment, feels pretty good about how its strategy worked out.

Norm-referenced tests like the SAT and the ACT have contributed enormously to the “educational arms race” — the ferocious competition for admission at top colleges and universities.

Only to the extent that the SAT and ACT are actually fairly good at discriminating among the better students. But the history of testing in China shows that Tiger Parents will compete like mad over any kind of test that gives their offspring a leg up: e.g., the imperial exam for would-be mandarins tested knowledge of the Confucian classics. The supply of freshman seats in the Ivy League plus Stanford and MIT has barely ticked upward in 30 years, while the number of people wanting to get their kids into those schools has vastly increased. That’s the heart of the supply-demand equation, not the kind of tests used.

This creates great pressure on students and their parents to avail themselves of expensive test-prep services in search of any edge. It is also unfair to those who cannot afford such services.

And how exactly is shifting the focus of testing from innate cognitive skills to quality of education experienced going to benefit the non-rich?

By design, norm-referenced tests reproduce the same bell-curve distribution of scores from one year to the next, with only minor differences. This makes it difficult to gauge progress accurately.

Maybe there isn’t a whole lot of progress?

… And by rewarding students’ efforts in the regular classroom, criterion-referenced exams reduce the importance of test-prep services, thus helping to level the playing field.

That certainly wasn’t the experience of the Chinese with their exams on Confucius.

The actual history of testing in the U.S. suggests that the introduction of SAT-type tests kept test prep down for a few decades until the Tiger Mothers finally overwhelmed them.

Look, the only way for any kind of testing to reduce competition is to set the maximum score so low that everybody of any ambition gets the maximum. It’s currently a little bit like that with the Advanced Placement tests, where the tests are scored on a 1 to 5 scale. But, typically, only a few percent of all 17-year-olds in the country get a 5 on AP tests. And students then compete on how many AP tests they can take.
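The low-maximum point is just a ceiling effect, and it can be sketched in a few lines of Python. The applicant scores and the 700 ceiling below are hypothetical, picked only to show the mechanism.

```python
def cap_scores(scores, ceiling):
    """Report every score above the ceiling as the ceiling itself."""
    return [min(s, ceiling) for s in scores]


def tied_at_top(scores):
    """Count how many test takers share the highest reported score."""
    top = max(scores)
    return sum(1 for s in scores if s == top)


# Hypothetical applicant scores on an uncapped scale.
applicants = [520, 610, 680, 720, 750, 780, 800]

# With no ceiling, exactly one applicant holds the top score.
ties_uncapped = tied_at_top(applicants)

# With a ceiling of 700, everyone who scored above 700 looks identical.
capped = cap_scores(applicants, ceiling=700)
ties_capped = tied_at_top(capped)

print(ties_uncapped, ties_capped)
```

Once the top four are indistinguishable on this test, the competition doesn't go away. It moves to whatever else can still tell them apart, such as how many tests they took.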

I’ve long argued that AP tests should be given more weight in college admissions because if kids are going to spend hundreds of hours on test prep, they might as well spend them studying a real subject like chemistry or US history. But colleges don’t like to emphasize AP tests too much because to do so would very much benefit the upper ranks of society, who have far more access to AP courses because they send their kids to school with other smart, ambitious students whose parents demand AP classes.

In summary, Atkinson, the former president of the University of California system, doesn’t seem very good at reasoning about bell curves and testing systems. I was going to say that maybe that’s just hard to do, but the comments at The New York Times are full of incisive critiques of Atkinson’s lunkheadedness.

So maybe our problem is that our society doesn’t select for public leaders who are good at reasoning statistically.
