See, for example, Bloomberg’s recent testimony before Congress, in which he said that “over the past six years, we’ve done everything possible to narrow the achievement gap – and we have. In some cases, we’ve reduced it by half.”
An analysis by the National Center for Education Statistics, the research arm of the federal Education Department, concludes that no achievement gaps have narrowed at all in New York City between 2003 and 2007. The only gap that moved in any significant direction is the one between poor students and the rest of the population, which widened slightly, that analysis said. The National Center for Education Statistics also concludes that upward trends in the reading scores of black and Hispanic fourth-graders lauded by Mr. Klein are not statistically significant.
In the article, Joel Klein reveals his statistical illiteracy:
“Those are just confidence levels. Nobody is saying this is a science,” Mr. Klein said. He added: “If three points is flat, and four points is statistically significant, then what you're doing is, you're playing something of a game.”
Chief press officer David Cantor called the memo from NCES “a politicized gloss.”
According to NYC’s results on the state exams, the situation is more complicated. The achievement gap is narrowing in some areas when one looks at “proficiency” levels (that is, whether a student is at Level 1, 2, 3, etc.), but not when one looks at the actual scale scores.
Some testing experts consider proficiency levels less meaningful than scale scores, as they can be arbitrary, subjective and easy to manipulate. Daniel Koretz, a professor at Harvard and a national expert on testing, has just published the must-read book of the summer, Measuring Up: What Educational Testing Really Tells Us. Here is what Koretz has to say about proficiency levels:
The best thing about Koretz’s book is his lucid explanation of why “test score inflation” inevitably occurs when you attach high stakes to exams, and how this undermines the integrity and validity of the results. This has increasingly been the case throughout the nation as a result of NCLB, but even more so here in NYC, as a result of the increasingly high-stakes policies of the Bloomberg/Klein administration.
Steve Koss has written about this eloquently on our blog, in relation to Campbell’s Law: “The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”
I have tried to explain this phenomenon to many elected officials, staff, and reporters over the years, apparently with little success. I certainly don’t know of a single NYC media outlet that has ever mentioned it, though Campbell’s Law was cited in some recent letters to the NY Times in response to the administration’s experiment to pay students for high scores.
I recall a lengthy discussion of this issue several years back with NY Times reporter David Herszenhorn, while he was still on the education beat, in which I explained my opposition to the Mayor’s newly proposed 3rd grade retention policy. I vociferously opposed this policy, and still do, not just because it was unfair to students to base such a life-altering decision on a single, fallible test score with such large margins of error, and not just because retention has been shown to have a racially disparate impact and to hurt rather than help most low-performing students.
My opposition was also due to the fact that the more significant consequences are attached to any test, the less its results can be trusted as a reliable gauge of real learning.
Since then, of course, the administration has piled on more and more high-stakes consequences -- for students, teachers, and schools -- by adding fifth and seventh grade retention, awarding principals, teachers and students monetary rewards for high scores, and threatening to close down schools if scores don’t improve fast enough. The scores themselves have been rendered entirely meaningless as a result, as excessive test prep, teaching to the test, cheating, and other strategies to “game” the system have totally overtaken our schools.
“One might expect that with the huge increase in the amount of testing in recent years, we would know more…Ironically, the reverse is true. While we have far more data now than we did twenty or thirty years ago, we have fewer sources of data that we can trust. The reason is simple: the increase in testing has been accompanied by a dramatic upsurge in the consequences attached to scores. This in turn has created incentives to take shortcuts -- various forms of inappropriate test preparation, including outright cheating -- that can substantially inflate test scores, rendering trends seriously misleading or even meaningless.”
Their laissez-faire attitude is revealed by the total lack of interest shown in following up on even well-documented cases of cheating. (See, for example, this story in the NY Sun; though it says the DOE is “investigating,” this will likely lead nowhere, as such stories have in the past.)
Q: What about complaints about the report-card grades for schools?
A: The report cards were probably one of the noisy periods. But . . . I can't tell you how many principals said to me, 'You know, chancellor, I didn't get the right grade but I promise you I won't get the same one next year,' so I think that had a big impact.