Saturday, May 23, 2015

Is There a Good Standardized Test?

It's a fair question, and one I've actually thought about often in the past few years. Have I ever encountered a standardized test that I found useful, or can I even imagine such a thing?

The short answer is, "No." But that's also the glib-and-not-very-useful answer, so let me see if I can explain why not. (My previous attempt to answer the question is here.)

Those in existence?

In thinking about existing standardized tests, I don't have much to consider. As a secondary teacher, I historically haven't dealt with nearly so many of these as my elementary brethren and sistren.

Of course, the giant of high school standardized testing has always been the SAT, and we have always understood that it's a lousy measure of what it claims to measure. For years it claimed to test student verbal reasoning skills when, in fact, it mostly just tested vocabulary. It also arguably tested student ability to think like middle class white kids. On top of that, it was, of course, highly gameable, as witnessed by the cottage industry of books, software and coaches generating revenue by helping students raise their scores.

And as omnipresent as the SAT's were and are, if we apply the Beneficial To Students and Teachers test, the SAT's fail. I learn nothing useful about my practice from SAT results, and my students take nothing away except their score. "Hey, these SAT results show the cognitive and knowledge areas in which I need to improve. I think I will take a summer school course in English just so I can work on them," said no high school junior ever.

The closest I've ever come to a useful standardized test would be the materials that came with an excellent literature series that I used years ago. The questioning strategies were excellent-- but that series only provided the questions. I was still grading the materials myself, so not quite a standardized test.

Could I do it, though? Could I finance my retirement by developing a grand and glorious English standardized test that would be useful to students and teachers across America?

I would face two challenge areas-- skills, and knowledge. Let me consider them separately.

Simple Skills

This would seem to be the easy area, at least for measuring simple skills. After all, shouldn't we be able to design a simple and useful standardized test for measuring, say, the skill of properly using commas in a sentence?

Probably not. My typical standardized test question will involve some sort of task involving comma use, say something like this:

Bob (1) you really annoy me (2) when you put the ocelots (3) hamsters (4) and beavers in the bathtub.

Commas should be inserted in 
      a) 1, 2, 3, 4
      b) 1, 3, 4
      c) 2, 4, 6, 8
      d) The Treaty of Versailles

Except that the skill of answering questions like this one is not the same as the skill of correctly using commas in a sentence. Proof? The millions of English teachers across America pulling their hair out because twenty students who aced the Comma Usage Test then turned in papers with sentences like "The development, of, language use, by, Shakespeare, was highly, influential, in, the Treaty, of Ver,sailles."

The theory is that Comma Use is a skill that can be deployed, like a strike force of Marines, to either attack writing a sentence or answering a test question, and there are certainly some people who can do that. But for a significant portion of the human race, those tasks are actually two entirely separate skill sets, and measuring one by asking it to do the other is like evaluating your plumber based on how well she rewires the chandelier in your dining room.

In other words, in order to turn a task into a measurable activity that can be scaled for both asking the question and scoring the answer, we have to turn the task we want to measure into some other task entirely.

Complex skills

Not a chance. I would like my students to be able to read an entire work and draw out some understanding of themes, character, writing technique, literary devices, and ideas about how the world works; and then to relate to all of that in some meaningful, personal way that they can express clearly and cogently.

The AP test comes as close as anything to handling a complex of skills like this, and they still add the element of "adjust your ideas and the presentation thereof to fit the preferred format and approach of the people delivering the test." The AP test also is delivered to a self-selected sliver of the whole school market-- if we tried to scale it out to every student in America, we would not get useful results.

As I've argued elsewhere, none of these critical thinking skills will ever be on a standardized test.


Well, what about knowledge? Can't we use a standardized test to see if students Know Stuff like the author of The Sun Also Rises or the contents of the Treaty of Versailles?

Probably? Maybe? At least as long as we stick to things that are simple recall. And while knowing a foundation of facts can keep us from saying ridiculous things (like "Hitler and Lincoln signed the Treaty of Versailles" or "American students have the worst test scores in the world"), there's a good argument to be had about the value of simple recall in education.

There's a reason that people associate standardized tests with simple recall and rote learning-- because that's the one thing that standardized tests can actually measure pretty well.

But more complex knowledge and understanding, the kind of knowledge that really only works its way into the world by the use of critical thinking and application-- that kind of knowledge doesn't make it onto a standardized test because it can't.

Context and Shared Language

Designing tests is one of the most challenging parts of my job, particularly because years ago I concluded that I needed to stop reusing tests from year to year.

See, for any of our higher order work, context and shared language matter, and that changes from year to year.

First, if I am going to be open to my students in my classroom, my instructional focus is going to shift from year to year. Understand, I am not not NOT a teacher who believes in a student-directed classroom. We don't take a vote on what we want to study, and I don't leave them unguided to somehow suss out the layers of Romantic poetry on their own. I am not the Sage on the Stage, but I am the adult in the room who's paid good money to direct and organize the learning, and I am supposed to know more about this stuff than my teenaged students, and pretending I don't is just a silly lie.

But all that said, I have to leave space for them, take cues from them, and sometimes follow their lead. There is no more powerful tool in the classroom than student curiosity, and I would be a fool not to follow it when it rears its rare and beautiful head.

All of which is a long way of saying that my instruction every year is shaped, to a greater or lesser degree, by my students, their strengths, their weaknesses, their interests, and the things that just kind of come up. Which leads to

Second, our shared language in the classroom. By the time an assessment rolls around, my students should have an idea of what I mean by "explain" or "support." Heck, by the end of the year, they should know what I mean by "write a few paragraphs" about a topic. One of the most basic functions of an academic pursuit is to develop shared language, or more exactly, a shared understanding of the language. Because anybody who wants to tell you that a word has only one exact meaning that is understood by every single language user-- that person is a dope (though, yes, while we may not agree on exactly what I mean by "dope," you get a general idea).

This is why Test Prep is a thing and will always be a thing-- because the test manufacturers have a special language which has never been shared with the students, and so, somehow (test prep) the students have to enter into that shared language so that they can understand what it is, exactly, the test is asking.

So frequently BS Test results only tell us how well (or not) the student acquired the unshared language of the test manufacturer, and not how much skill or knowledge the students possess.  This may well work better for math (though I have doubts), but in reading, literature and writing, there simply is no universally shared academic language, which means that all standardized English tests are written in a special English-ish foreign tongue. That inexactness would not be a big deal if we were not imbuing these tests with superhuman powers of analysis.

A good assessment is the culmination of what we've done, not the reason we did it in the first place. Bottom line-- I can't write a really good test for students I've never met, unless we've somehow all agreed on the language that we're using and the nature of the content we're testing. Common Core was arguably an attempt to bridge that gap, and, gosh, that's just working out so very well.

Testing Testing

In fact, my classroom practice over the decades has moved slowly and steadily away from testing and toward other sorts of assessment, because all tests ultimately and primarily test the student's ability to take a test. Now, that's not the end of the world-- there are such pointless activities in life and in some cases, testing gets us close enough to the heart of the matter to do.

But any kind of assessment ultimately has to be about the teacher trying to find out what the student knows and can do, not, as is sometimes the case, about making the student prove something to a Higher Authority.

The search for a good, useful assessment (or constellation of assessments) is an ongoing one, a journey that none of us will ever complete. But I am pretty sure that standardized tests lie in the opposite direction. As I said at the top, perhaps when we're dealing with smaller children with fewer filters and simpler skills, there are useful standardized tests (though watching my wife teach first grade, I have my doubts). But at the high school level, I think not. Consider that the use of standardized testing in college classrooms is not exactly widespread.

Poor Charles. Ask a simple question in a 140-character medium, and get this monstrosity of an essay in response. It probably would have been easier just to read the Treaty of Versailles.


  1. Special Education teachers and reading specialists use sort-of "standardized" tests that are actually fairly helpful in diagnosing certain impediments in a student's decoding or comprehension skills, but they are time-consuming and must be administered to one student at a time.

  2. Standardized math tests provide NO useful information to teachers. Without seeing the kids' thought processes (as you cannot with a standardized test, and no, not on the "drop and drag" SBAC tests either), you learn NOTHING. You cannot tell whether they made a simple error "accidentally losing a negative sign", or two simple errors that cancel each other out (this happens more often than you would think, actually), or multiplying a simple problem inaccurately (a problem which rears its head even with math teachers and EVERYONE no matter how smart, experienced...), or if they truly have NO IDEA how to solve the problem, but can rule a few answers out for other reasons and are good or lucky guessers....

  3. One of your best! We used an ACT product (I forget which one) at my open-admissions liberal arts college to assign students to developmental reading and comp classes, and they were OK for the purpose, but we also allowed instructors to reassign kids who were obviously ready for English 111 during the first week or two of the semester. That was for very basic skills, though. Anybody who thinks PARCC, etc., can measure critical thinking, uh, needs to re-examine their critical thinking skills.

  4. If you've ever watched a kinder or 1st grader use a mouse, you'll get why many computerized tests are GIGO (garbage in / garbage out). I've watched them select an answer and by the time they manage to steer the cursor over to the Done button (clicking as they go) they don't notice how they've already changed their answers several times by the time they finally hit submit. Even when the tests read some of the questions and answer choices to them, they don't listen - they click on the first choice that has anything in it that looks familiar. So after this big waste of time generating invalid data for the district/state, I have to assess them all individually in person to find out what they really do or don't know so we can plan instruction.

  5. "The theory is that Comma Use can be deployed, like a strike force of Marines," says a person who has obviously never deployed any Marines and doesn't know how hard it can be.

    Thanks for every grumpy old word you write.