Tuesday, May 29, 2018

Testing with Purpose

I've spent the last week buried in end-of-year exams, which for us came right on the heels of the Keystone Exams, Pennsylvania's version of the Common Core Big Standardized Tests. For me, it's a pretty blunt reminder of how tests are supposed to work, and how the BS Tests fail to meet that standard.

A good test is part of the learning process.

A good test does not require education to stop while we measure one thing or another. A good assessment, well-written and properly timed, helps the students bring their learning (so far) into some sort of crystallized relief. The learning process is like collecting a bunch of books and stacking them up in the center of the room and building some shelving on the wall. A good assessment consists of getting those books organized and neatly placed on the shelves.

A good assessment involves applying the skills and knowledge acquired-- not rooting through stacks for some detail from some page. A good assessment is as close as possible to a real and authentic task. For instance, the best assessment of writing skills involves writing something-- not answering a bunch of multiple choice questions about made-up faux writing scenarios, or writing some faux essay so strictly formulated that all "correct" versions would be nearly identical.

The end goal for most courses involves thinking-- thinking about the content, thinking about the ways it fits together, thinking about the ways that the information can be organized and worked within that discipline. Thinking is the great immeasurable, but it is also the end goal of most courses, and so we end up back at writing as the closest possible authentic assessment of what we're really after. (Multiple choice tests almost never measure thinking, critical or otherwise.) And the best writing assessments are built to match the ebb and flow of the course itself.

In my 11th grade Honors English class, I assign two massive take-home essays every year. Because the literature portion of the course is organized around the study of American literature through five different literary periods and across an assortment of topics, a typical year's final might include "Trace one of the following topics through the five literary periods we studied, showing how each movement handled the topic and providing examples from works of the period" or "Pick a fairy tale and rewrite it five times, as it would be written by an author of each period we studied." But through student questions and curiosity and engagement and just the odd paths that we sometimes wander down, I have also given this question as a final essay: What is the meaning of life? Explain and discuss.

I don't object to all objective tests. In addition to my big honking essay tests, I give an in-class mostly matching test that requires the students to recognize works of literature from short clues ("his sister was dead, but she got better" = "Fall of the House of Usher") and even such pedestrian tasks as matching a work with its author. My goal, at a minimum, is to have them finish the year by thinking, "Damn, but I read a lot of stuff this year" and at a maximum to show that they have some rudimentary content under control. But even here, the clues that I give are based on how we discussed the work in class and not my own selection of some piddly detail.

In the end, I'd argue that no good assessment is divorced from the entire learning process that led up to it, nor from the end goals and purposes of that unit for the student. The BS Tests are divorced from both. They have nothing to do with the organic, natural flow of education in the classroom and were written with no knowledge of or regard for that group of students. Nor do they promise any sort of culminating learning activity for the students, but instead are intended to generate some sort of data for a disconnected third party.

It's as if a basketball team, after practicing for a month, didn't wrap up that work with an actual game, but instead took a multiple choice test about characteristics of basketballs, hoops, and shoes. It's as if a band or chorus, after rehearsing music for a month, did not put the final capstone of a performance on their learning, but instead sat down at computers to take a point-and-click test about the bore size of a trombone and the average range of sopranos.

A good final assessment is the icing on the cake, the lightbulb over the head, the victory lap around the track that has been mastered. The BS Tests are none of these things, but instead are a collection of pointless tasks doled out by faceless bean counters for purposes known only to far off bureaucrats. When students say that these tests are pointless (and they do, all the time), they aren't saying "this isn't even part of my grade" so much as they're saying, "this doesn't add anything to my education." When legislators say these tests are pointless (as they do every time they artificially attach stakes to them in an attempt to make them seem Important), they admit that they are wasting a lot of money and a huge amount of time.


  1. First, thank you for writing this. I had hoped for some time you would reconcile your dislike of standardized tests with your presumptive acceptance of your own tests.

    And I'd certainly concede several points that you make. "And the best writing assessments are built to match the ebb and flow of the course itself." Sure, no one can write a test as relevant as the course's own teacher. And I'd also concede your point that testing and teaching must be iterative and transparent in order for testing to serve its educational purpose.

    But I wonder if you miss part of the objective of standardized tests? Would you concede that the 3.2 million teachers in this country vary considerably in effectiveness? (When I say "effectiveness", I don't simply mean the ability to help students elevate their scores on a standardized test.) Rather, in some broader sense of effectiveness, would you agree that there is variance? OK. Well, what those standardized tests attempt to do - albeit imperfectly - is to compare teachers across very different geographies. In a sense, the tradeoff for the imperfection in the test (which is not calibrated to the particular course) is that it is far more comparable.

    Now, you can retort that students are unmotivated. Or that the next step after finding a struggling teacher or school should be to help them rather than cull them. All that and more are fair questions. But aren't these questions separate from the measurement itself? To object to a standardized measure because you dislike the next step is like someone who may be overweight refusing to stand on the scale because she doesn't like the next step - exercise or diet or whatever.

    If you really do not object to distinguishing more effective teachers from less effective ones, but are simply skeptical of the yardstick itself, perhaps you might consider using your retirement to build a better test?