“We do practice tests and practice tests and more practice tests. And then we do the real thing. We take a lot of tests.” But do you learn from the tests? “Not really.”
Nicholas Nieves, 4th grader, Press & Sun-Bulletin (7/13/10)
For the last decade, K-12 education has been driven by high-stakes testing, primarily administered in grades 3-8 in English language arts and mathematics. As a result of Race to the Top (RTTT), most states will soon administer even more high-stakes tests. For example, in the 2014-15 school year, 25 states, including New York, will begin administering the Partnership for Assessment of Readiness for College and Careers (PARCC) assessments in grades 3-12 four times per year. At the same time, there is an inexorable push to tie teacher and administrator performance reviews, tenure decisions and pay to student performance on these and other assessments.
I am conflicted about high-stakes testing in education. On one hand, if I needed surgery, I would certainly select a surgeon who had passed his boards over one who had not. On the other hand, I have a friend who had to take the Law School Admission Test more than once in order to score high enough to be admitted to University at Buffalo Law School, where after his second year he was celebrated as the number one student in his class. Such anecdotes simplify a very complex issue, but they do illustrate the inherent tensions within it.
In The Test Generation, Dana Goldstein refers to Campbell’s Law to illustrate the risks of high-stakes testing: “The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor…Campbell’s Law is especially applicable to education; there is a preponderance of evidence showing that high-stakes tests lead to a narrowed curriculum, score inflation, and even outright cheating among those tasked with scoring exams.”
Here are four specific traps that I believe implementers of RTTT and the PARCC assessments will have to avoid if their reforms are to bear fruit.
Trap #1: The discrepancy between what is measured and what is desired
Education reformers want to improve the education system. They implement high-stakes tests as a critical strategy to achieve that goal. However, it is fair to ask whether high-stakes tests improve the education system or simply produce good test takers. Donella Meadows writes in Thinking in Systems: “Systems, like the three wishes in the traditional fairytale, have a terrible tendency to produce exactly and only what you asked them to produce…If quality of education is measured by performance on standardized tests, the system will produce performance on standardized tests.”
Will RTTT and the PARCC assessments result in a high-quality educational system, or just more really good test takers?
Trap #2: “Erase to the Top” and the incentive to cheat
A recent USA Today exposé revealed suspiciously high erasure rates on high-stakes tests at some high-performing Washington, D.C. public schools. To be flagged, “a classroom had to have so many wrong-to-right erasures that the average for each student was 4 standard deviations higher than the average for all D.C. students in that grade on that test. In layman’s terms, that means a classroom corrected its answers so much more often than the rest of the district that it could have occurred roughly one in 30,000 times by chance.”
Despite that very high threshold, from “2008 to 2010, 103 public schools in the District of Columbia were flagged for having at least one class of students with statistically high rates of wrong answers that were erased and replaced by correct answers on standardized tests. That represents more than half of the schools in the system.”
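For readers who want to check the “one in 30,000” figure, here is a minimal sketch. It assumes classroom erasure averages are roughly normally distributed and that only the upper tail counts (unusually many erasures). Both are my assumptions; the USA Today article does not spell out its method, and the function name is purely illustrative.

```python
import math


def upper_tail_probability(z: float) -> float:
    """Probability that a standard normal value exceeds z.

    Assumes a normal distribution; this is an illustrative
    assumption, not the method used in the USA Today analysis.
    """
    return 0.5 * math.erfc(z / math.sqrt(2))


# Chance of landing 4 or more standard deviations above the mean.
p = upper_tail_probability(4.0)
odds = 1 / p

print(f"P(Z > 4) = {p:.2e}, or about 1 in {odds:,.0f}")
```

Under those assumptions the odds come out a little over one in 31,000, which is consistent with the article's “roughly one in 30,000.”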
Among the schools flagged for wrong-to-right erasures were eight of the 10 campuses where Chancellor Michelle Rhee “handed out so-called TEAM awards ‘to recognize, reward and retain high-performing educators and support staff.'” (Rhee awarded more than $1.5 million in bonuses on the basis of big jumps in 2007 and 2008 test scores.) “At three of the award-winning schools – Phoebe Hearst Elementary, Winston Education Campus and Aiton Elementary – 85% or more of classrooms were identified as having high erasure rates in 2008.”
How will states avoid the “Erase to the Top” effect as they implement RTTT and the PARCC assessments and tie them to administrator and teacher performance reviews, tenure decisions and pay?
Trap #3: Rigidity and conformity replace flexibility and creativity
The pressure to improve on ELA and math assessments has produced many deleterious effects. Student time with the social sciences, art and music has been reduced or eliminated. Drill and kill has replaced depth and breadth in many classrooms. Divergent thinking is dominated by convergent thinking. The need to provide a correct answer limits students’ opportunity to ask the right question.
How will states ensure a rich, robust educational experience that promotes creative thinking and engages student minds in multiple disciplines?
Trap #4: Stripping a complex, creative process down to a mechanical one
Demands upon teachers to cover ever-expanding curricula, prepare for increasing numbers of assessments and conform to mechanistic pedagogic practices have constrained teacher flexibility and creativity within the classroom. This disturbing trend is exactly the opposite of what is occurring in the private sector, where creativity and flexibility are highly sought-after attributes.
A recent Economist blog sums up this concern succinctly: “Governments and voters are confronted with a phenomenon they are desperate to improve, but can’t measure. What goes on in a classroom is a social phenomenon that can’t be effectively captured through standardised measurements. But they need a number. So they’re creating standardised measurements to get one. But immediately, the application of the measurement and its incentives changes the way the phenomenon is organised. A complex, creative process is stripped down to a mechanical one designed to produce high test scores.”
How will implementation of RTTT and the PARCC assessments recognize and promote teaching as a complex, creative process and not diminish it as mechanistic and rote?
An authentic, high-stakes test of RTTT and PARCC effectiveness
Here is how I will know whether RTTT and the PARCC assessments have been implemented effectively. Years from now, I will read an interview with a fourth-grader from a local elementary school who comments:
“Every day I love to go to school. We learn, and learn, and learn. And then occasionally we take a test.” Do the tests effectively assess your learning? “Yes, absolutely!”
Brian Preston says
Sean,
Great post here. You’re hitting on all cylinders. The Economist blog reference was wonderful–everyone should read it. Daniel Koretz, the Harvard assessment guru, advised New York on the quality of our assessments, which contributed to the increased difficulty levels we’re experiencing now. His reflections on the narrowing of the curriculum due to testing are being ignored by all states. The Colorado value-added report is truly discouraging.
When 3 of 4 businesses fail in 5 years, why are we attempting to turn education into a business model? We have to deal with the social issues of urban poverty that overwhelm the low-performing schools everywhere. Testing and rating teachers is not a way to deal with poverty and second language familial challenges.