Special Report

May/June 2012 Grand Test Auto

The end of testing.

By Bill Tucker

In the old days, supermarkets struggled to keep track of the thousands of items on their shelves. Each month, they’d shutter the store so their employees could hand-count every soup can, cereal box, and candy bar. The first electronic scanning systems came along in the 1970s and helped take a little of the drudgery and inefficiency out of the grocer’s life. Then came waves of advances in computing power and remote sensing technologies. By now, for most retailers, regularly shutting down to conduct inventory is a thing of the past. Instead, they can constantly monitor their shelves through bar codes, scanners, and radio-frequency devices. And as it has turned out, all this technology has given them far more than just a better way to count cans: today, retailers not only keep track of what’s on their shelves, they also use the constant flow of real-time information to predict, analyze, and respond quickly to consumer demand.

This kind of real-time assessment and response has become a part of modern life in a number of areas. New car owners increasingly rely on remote sensors, not a yearly mechanic’s visit, to detect engine problems and keep tires at the right pressure. And more and more, diabetics no longer have to stop and inject themselves. Instead, they use a continuous glucose monitor to send blood readings to an insulin pump, which warns them if their blood-sugar level spikes and allows them to adjust their level of insulin. In each of these areas, a scientific understanding of systems—whether biological, mechanical, or commercial—has been combined with new technology to develop more useful, productive, and actionable monitoring and measurement. And all of it takes place almost invisibly, in the background.

Not so in America’s classrooms. Schools across the nation still essentially close to conduct inventory, only we don’t call it that. We call it “testing.” Every year at a given time, regular instruction stops. Teachers enter something called “test prep” mode; it lasts for weeks leading up to the big assessment. Just as grocery-store workers might try to fudge inventory numbers to conceal shortfalls in cash, schools sometimes try to fudge their testing results, and cheating scandals erupt. Then, in a twist, regular classroom instruction resumes only halfheartedly once the big test is over, because there are no stakes attached to what everyone’s learning. Learning stops, evaluation begins: that’s how it works. But in the not-so-distant future, testing may be as much a thing of the past for educators as the counting of cans is for grocers.

Zoran Popovic, a computer scientist and the director of the Center for Game Science at the University of Washington in Seattle, is one of a new cadre of researchers pointing the way to a post-testing world. Popovic has designed a prototype of an online, puzzle-based game called Refractions. The game challenges students to use their knowledge of fractions to help provide the right amount of power to animals in marooned spaceships. Using puzzle pieces, students bend lasers and split the energy beams into half, one-third, and even one-twelfth power. In the process, they get a feel for a number of important concepts, such as equal partitioning, addition, multiplication, and common denominators.

While Refractions looks like a relatively simple game, the real complexity is behind the scenes. The game records hundreds of data points, capturing information each time a player adjusts, redirects, or splits a laser. This data allows Popovic and his colleagues to analyze and visualize students’ paths through the puzzles—seeing, for example, whether a student made a beeline for the answer, meandered, or tried a novel approach. Since the data shows not just whether the student solved the puzzle, but also how, it can be used to detect misconceptions or skill gaps. Good math teachers do this all the time when they require students to “show their work”—that is, to write down not just the answer to a math problem on a test, but also the calculations they used to derive the answer. The difference is that Popovic’s game essentially “shows the work” of hundreds of thousands of players, recording data automatically in a way that allows teachers and scientists to draw robust inferences about where students tend to go astray. This would be virtually impossible with paper tests. And it’s this massive scale that promises not only new insights on student learning but also new tools to help teachers respond.

Popovic’s game is one of dozens of experiments and research projects being conducted in universities and company labs around the country by scientists and educators all thinking in roughly the same vein. Their aim is to transform assessment from dull misery to an enjoyable process of mastery. They call it “stealth assessment.”

At this point, all this work is still preliminary—the stuff of whiteboards and prototypes. Little if any of it will be included in two new national tests now being designed with federal funds by two consortia of states and universities and scheduled to be rolled out in classrooms around the country beginning in 2014. Still, researchers have a reasonably clear grasp of what they someday—five, ten, or fifteen years from now—hope to achieve: assessments that do not hit “pause” on the learning process but are embedded directly into learning experiences and enable a deeper level of learning at the same time.

In this vision, students would spend their time in the classroom solving problems, mastering complex projects, or even conducting experiments, as many of them do now. But they’d do much of it through a technological interface: via interactive lessons and simulations, digital instruments, and, above all, games. Information about an individual student’s approach, persistence, and problem-solving strategies, in addition to their record of right and wrong answers, would be collected over time, generating much more detailed and valid evidence about a student’s skills and knowledge than a one-shot test. And all the while, these sophisticated systems would adapt, constantly updating to keep the student challenged, supported, and engaged.

One way to think of stealth assessment is to compare it to a GPS system, one that has the ability to monitor, assess, and respond to progress along the way. The metaphor is helpful, because it illuminates not only the promise of stealth assessment but also the crucial component that we lack now. A GPS system starts with a detailed digital map of all the roads and possible detours in a given terrain; then the system’s software constantly tracks your car’s location relative to that map. Similarly, stealth assessments will require a detailed understanding (a cognitive model or map) of all the different ways learning can progress in math, science, and various other disciplines. A student’s performance would then be tracked against the various routes and pathways that learners tend to follow as their understanding progresses. But while cognitive scientists have made great strides in the past two decades, our understanding of how students learn is not nearly detailed enough to resemble a full map, certainly not one that reflects the whole range of possible routes, detours, intermediary steps, and junctions created by each student’s individual strengths and weaknesses.

Bill Tucker, since 2005 the managing director of Education Sector, a D.C.-based think tank, will soon be joining the Bill & Melinda Gates Foundation as deputy director, policy development, U.S. Program. He has written about education technology, innovation, and policy for publications including Education Next, Education Week, and Educational Leadership.


  • Judy Willis, M.D., M.Ed. on May 09, 2012 2:42 PM:

    PREDICTION from a neurologist who then became a teacher (2nd, 5th, 7th) and now does professional development: Within five years in some countries (five to ten in others) open internet access for information acquisition will be available on standardized tests. This access will significantly reduce the quantity of data designated for rote memorization.

    The Current Information Load Is Too BIG
    Recall that before 1994 a student would be expelled from the SAT exams for bringing any type of calculator. Starting in 1994, calculators were not only permitted, but were essentially required. The driving factors came from the level of mathematics taught and tested and the availability of graphing calculator technology. This change gave students the appropriate tool for accuracy and efficiency (and the one used by most professionals who used mathematics beyond basic arithmetic). Consider also that calculator access for these standardized tests did not reduce the instruction in and development of arithmetic automaticity. Mental access to such facts and procedures as the multiplication tables and manipulation of fractions, without a calculator, remains a valued goal for all students.
    We are now in the same nexus of advancement of information and technology to make the equivalent jump for other subjects. Access to the internet for information acquisition during tests (and learning) is the appropriate response now, just as the calculator access was in mathematics almost two decades ago.
    As technology and globalization exponentially increase the available facts and knowledge base of all subjects and professions, the response in education has been to incorporate more and more information into the requirements for each school year. The current system of "if it's information, teach it and test it" can no longer support the volume of information. Textbooks cannot get much bigger, and the impact of the increasing demands on students to memorize data is increasingly counterproductive.
    In the "real world," professionals in all specialties and businesses use the superiority of the web over the human brain to accurately hold and retrieve facts and to keep up as "facts" change too quickly for even eBooks to be current and accurate by the time they are released.
    Many practicing physicians do not rely on memory, or even textbooks or the latest journals, for the most current, accurate information about diagnostic testing, best treatments, and other facts that change daily. For example, before prescribing a medication, they search the Medscape or Epocrates websites for the most current facts that could have significant impact on a patient's reactions to the medication. Even for a medication that has been evaluated for cross reactions with other medications when it was tested and when the FDA product information was most recently reported, new information can be critical. That medication could have just been found to cause problems when taken by patients also taking a different medication for another medical condition. Thanks to the physician having access to that new information before prescribing medications, the risk of potential complications is reduced.


  • Judy Willis, M.D., M.Ed. on May 09, 2012 2:48 PM:


    Memorization Breaking Point
    Boredom, frustration, negativity, apathy, self-doubt and the behavioral manifestations of these brain stressors have increased in the past decade. As facts increase, over-packed curriculum expands, and demands for rote memorization for high-stakes testing grow, the brains of our students have reacted to the increased stress. High stress, including that provoked by sustained or frequent boredom or frustration, detours brain processing away from the higher, rational, prefrontal cortex. In the stress state, the lower, reactive brain is in control. Retrievable memory is not formed and behavioral responses are limited to involuntary fight/flight/freeze reactions, seen in the classroom as act-out, zone-out, or drop-out.

    Students Don't Get the Brains They Need
    Even if the medical, social, psychological, and ethical problems do not promote the change in testing, the economic demands as to what employers want as employee skill sets will inevitably topple the factory model of education.
    The factory model of memorization of facts and procedures that was preparation for assembly line work cannot keep up with the information age requirements for an educated workforce. With the growth in the information base, employers in global industries that develop new products or systems already report they are more interested in a potential employee's abilities to respond quickly and successfully to frequent change, and to communicate, lead, and collaborate, than they are in comparable work experience. Desirable employees are those capable of making use of new information and technology to solve new problems and innovate ahead of the competition.
    The lives our students will live and the jobs for which they'll compete will not be about answering questions correctly, but about how they use knowledge and respond to changes. Yet currently the time sacrificed to fact memorization and test prep is resulting in more high school dropouts and students graduating from the secondary system without the preparation to succeed in college or employment, or to lead fulfilling lives.

    Freedom from excessive rote fact memorization focus means teachers can be creative individually and as professional learning communities. There will be reduction of the "management" problems that currently result from stressed-brain reactive behavior. Educators will be able to develop and use more engaging, relevant, and equitable learning experiences enhancing cross-curricular skills and competences. More access to foundational facts, which are not equally acquired by some students with language or learning differences, will mean they are not held back from applying other strengths to build conceptual knowledge and understanding. As students are guided with learning opportunities that develop their executive functions, they will develop understanding beyond just knowing. Their extended neural networks will empower them to transfer knowledge to new applications as we help them build the brains to achieve their greatest creative potentials.

    Read Complete Comment in my upcoming staff blog for EDUTOPIA.org Education's Next Big Bang May 11. WEBSITE www.RADTeach.com

  • skeptonomist on June 07, 2012 10:34 AM:

    What might actually be done to evaluate student performance is largely irrelevant, because the current testing regime has been imposed for largely ulterior motives: the desire to break teachers' unions, cut down on expenses, prove that public schools are inferior and allow higher-income people to send their children to private schools without paying school taxes, etc. Thus the support for real improvements in teaching is actually limited, especially as funds are being choked off for public schools. For-profit schools are generally not going to spend a lot of money on novel methods.