The race to fix America’s broken system of standardized exams.
That changed in 2001 when President George W. Bush signed the No Child Left Behind law, under which, for the first time, the federal government itself was demanding that school districts be held responsible for the performance of their students. The tests that gauged this performance measured minimum competency, and they required states to show that their students were making yearly progress toward the goal of becoming proficient. They also led to a fundamental change—many would say for the worse—in the relationship between testing and instruction. Whereas the original goal of achievement tests was to improve instruction by providing educators with useful information, says Daniel Koretz, an assessment expert with the Harvard Graduate School of Education, the new goal was to improve instruction by holding someone accountable for results. Koretz calls this shift “the single most important change in testing in the past half century.”
NCLB has had some successes. Because it requires states to break down data among racial and other demographic groups, it has identified significant achievement gaps, and in most states those gaps are narrowing. It deserves credit for inducing broad gains in achievement in the key subjects of reading and math (even as it crowds out other subjects), and it has encouraged teachers to use data to shape instruction. But more than ten years after the law’s passage, 50 percent of schools are not making the Adequate Yearly Progress required by the law.
The greatest drawback of NCLB, meanwhile, is the one that so unnerves Caryn Voskuil: the tests it spawned ask students to restate and recall facts rather than to analyze and interpret them. It turns out that this is largely a legacy problem: because the stakes of standardized tests in America before NCLB were historically very low, states had no interest in paying much for them. And when those stakes got higher and states did need measures of accountability, they simply used or replicated the cheap tests they already had. They did so, in part, because each state had its own standards, and thus needed its own tests. That fragmented demand, along with the need for lightning-fast scoring, led to a shortage of experts to build the tests, as well as downward pressure on the profit margins of testing companies. The troubles in the industry, according to Thomas Toch, a senior fellow with the Carnegie Foundation for the Advancement of Teaching, created a strong incentive for states and testing contractors to write tests that measure largely low-level skills.
When President Obama took office in 2009, he inherited all the flaws of NCLB and standardized testing. But just as frustration with the law was reaching its height, he was also handed an opportunity: the common core movement, an initiative of state governors and the heads of large school systems, was guiding the states toward uniform academic standards, thus removing one of the biggest obstacles to improving tests and raising achievement. Using the vast pool of money established by the 2009 federal stimulus package, Obama prodded the movement along. Specifically, he allocated $330 million for the states to design a state-of-the-art, nationwide test. In laying out this challenge, the administration established the following guidelines: the tests should be aligned to the new high standards; they should measure deeper learning; they should be computerized; and they should be capable of being used to evaluate not just students but educators. Oh—and they would have to be up and running in classrooms by 2014.
The states banded together to embrace the challenge, and eventually they winnowed themselves into two pioneering R&D teams (with, alas, anesthetically long names): the Partnership for Assessment of Readiness for College and Careers (PARCC) and the Smarter Balanced Assessment Consortium (SBAC). Each team, or consortium, is producing its own tests, but their scoring systems will be comparable—as those of the ACT and the SAT are to each other—so there will essentially be one national benchmark of readiness for college and careers. Over the past several months, these networks of state assessment directors, teachers, college administrators, content experts, and psychometricians have been racking up frequent-flyer miles and phone minutes, hashing out all the intricacies of twenty-first-century assessment. It’s not exactly astronauts and rocket scientists in The Right Stuff—but their efforts may ultimately have as much or more of an impact on the country.
Because the consortia are still letting contracts, they won’t have test prototypes until this summer. But already the outlines of the two projects are taking shape. Both groups are designing interactive computerized tests that will have far more essays and open-response questions, more practical math exercises, and more word problems than current models. Both will use more nonfiction and informational text in addition to literary text. Both also call for fast machine scoring. The groups have similar goals for the long term, but PARCC, whose assessments won’t even be fully computerized until 2016, is less ambitious and more practical in the short term.
Since they are required by NCLB, most of the tests offered by both consortia will be “summative,” meaning that they summarize the development of learners at a particular time. PARCC, a collaboration among twenty-one states, is focusing on these kinds of tests, which states will use to hold educators accountable and to judge students’ readiness for college. It will have two assessments, and in a big departure from current practice, one will include performance tasks, such as asking a student to analyze a text using evidence to support claims or having him apply math skills and processes to solve real-world problems. At the end of the year, it will combine these into one summative score. PARCC will also have a speaking and listening test graded by a teacher.
The SBAC, a collaboration among twenty-six states, will likewise create summative tests, but it will additionally develop “formative” assessments—tests that are used to gauge student progress in midstream and help teachers make course corrections. A formative test, which takes place during a sequence of instruction, can consist of anything from calling on a student in class to giving him a math quiz or assigning him a lab report. In each case, the teacher uses the resulting information to adjust her instruction. Designers of the next-generation tests believe that some standardized assessments can be formative, as well.