Monday, October 12, 2015

On Childlike Faith in Tests

Some blog post titles just demand your attention. Yesterday, my attention was grabbed by this one: "Tests are inhuman-- and that is what so good about them." Yes, there's an 's missing from that title, but there's a lot more than that missing from the post itself.

The writer is arguing for the value of the impartial, unbiased test. And part of her argument is solid. Teaching is most often done by human beings, and human beings are biased. Therefore it will come as no surprise that A) teachers have biases and B) if they're not careful, teachers will let those biases bleed over into their evaluation of students. This is inarguably true.

It may seem, the writer says, that teacher evaluation is nicer, more humane, but in fact the intrusion of bias can make teacher evaluation the most unkind of all, denying some students credit for their achievements and being inherently unfair. Also true.

If you want fairness, progress, equality and reliability, then human judgment may not be the best method.

Um, wait.

What other judgment is there?

Okay. You might say Judgment of God, but I believe there's a special day set for that judgment, and it's coming later. I don't think it's a significant factor in, say, ninth grade algebra.

This is what I don't get about some test devotees-- this belief that tests somehow descend from heaven on a fluffy cloud, free from human contact and cleansed of all human frailty. Impartial, perfect, and as divinely sourceless as an angel or Santa Claus.

But no. I'm pretty sure that tests are written by human beings. Imperfect, biased, judgment-making human beings.

That's fine. Making judgments is one of the most fundamental, and fundamentally necessary, human activities. We can't talk about not making judgments at all, but we can and must talk about making good judgments and about being aware of our biases as we make those judgments. That is part of what makes a teacher a professional-- an awareness of the many biases at play in a specific classroom and an ability to correct for those. The "specific" part matters because context matters-- "Draw a picture of your father working" can seem like a perfectly harmless prompt unless you know that Pat's father is dead and Chris's father is in prison.

In another post, this writer makes this extraordinary claim:

 I don’t think you can improve equity, teacher quality and a love of learning without some form of reliable feedback – and exams are basically the best and most accurate method of gathering feedback that we have.

Oddly enough, that in itself is a bias that would affect a classroom. Imagine a student who brings up a drawing of a butterfly and says, "How do you like this?" And the teacher replies, "Sit down, Pat. We'll have to run it through the color spectrum analysis scanner to see how you did."

A reliance on testing means that we make judgments about what behaviors, knowledge and skills are worth measuring. A belief in the perfect awesomeness of standardized tests leads us quickly to the conclusion that only things that can be measured by standardized tests are worth knowing or doing. That's baloney. And it's a huge bias.

And that's just if the tests we're discussing are reasonably decent tests. If we get to tests such as the current crop of Big Standardized Tests that ask questions requiring students to identify the single "correct" author's purpose, or select the single "correct" most effective sentence, we have now thrown in more bias about the content itself.

Sure, testing can provide useful info of some sorts on some occasions. But we are not exposing students to some perfect, bias-free, inhuman judgment. Rather, we are allowing someone else's judgment and bias into our classroom, which is perfectly okay and not a bad way to balance out our own biases-- as long as we recognize that's what we're doing.

The best way to deal with bias is to put a bell on it, acknowledge it, hear it coming, and factor it out. To try to hide it behind claims of perfect inhuman judgment is to give that bias enormous power that it does not deserve to have, and that's what writers like the blogger in question propose to do.


  1. The writer is British, and referring to a set of exams that bear no relation to the BST. A and O levels require several sets of essays and problems, with no multiple choice or blank-filling, and they are written and graded by teachers, but not the student's own - exams are anonymous. I actually approve of this kind of assessment for college-bound seniors, but here is the irony: even THESE exams can be gamed! There is just no such thing as measuring learning without affecting learning. Kind of a Heisenberg's uncertainty principle of learning.

    1. But don't such tests require human judgment to score? I know that I have to use judgment when I grade my students' philosophy essays at the college level. So, how can the author claim that these tests provide scores without involving human judgment? Am I missing something?

    2. No, you're not missing anything, the author is. She does seem to much prefer multiple choice tests, since she sees them as "objective" because they don't need human judgement for scoring, but she misses that human judgement is needed to create them. And a lot of other things about their limitations.

    3. Oh yes, there's that too - the author's odd belief that academic tests are like blood tests. Talking about evaluating performance without using human judgment is like talking about evaluating a relationship without reference to human experience.
      I suspect that what the original author is really thinking about is being graded by strangers rather than your own teacher. I really hate grading the same students I teach. It seems strange to be both coach and judge. Again, the British system largely relies on exams where the assessors don't know the students, and are looking simply at their work. There's definitely a case to be made for that, again, for college-bound seniors. But of course, OF COURSE! this requires human judgement.
      I mean, who writes the bloody tests? Guinea pigs? Martians?

  2. My experience in higher education is that the teachers do not like making judgments on their students. In examination board meetings to discuss the class of degree to award, when there was a student on a borderline, I would say, "We need to consider this one carefully." Others would say, "But the marks say he should get a first class degree," and I would say, "Look at his project mark, it's well below the borderline, look at his project report, speak to his project supervisor." In many cases the marks won, as they all had more faith in them, having forgotten that the marks were a product of human activity.

    1. Yes - there's more than one way to understand "objectivity" in grading. Marks might be objective because they reflect someone's essential competencies, without reference to human judgment (which is absurd). Or they might be objective because we don't know the student involved, and thus aren't biased for or against her. There's something to this second idea of objectivity, as evidenced by all kinds of legal and professional practice (recusing yourself in cases where you have involvement, for example). But, as you say, even this "objectivity" can be called into question.
      I am one of those teachers who don't like grading the students I teach. It seems to be a conflict of interest - am I coach, or judge?

  3. Dear Mr. Peter Greene:

    This article you are writing about was written by a young person in the U.K. Apparently she's a new teacher, who finds out she isn't up to the task, and instead of looking for mentors, has decided that, “It's not ME! It's everything around ME!” She actually has the non-gender spherical entities to write a book.

    She lists herself as “R&D manager at ARK” (whatever the hell THAT is), and oh! She's a “West Ham fan, cricket fan. Author of Seven Myths about Education.”

    Get a grip, Peter. One dragon at a time, and foreign dragons need not apply. We have enough local and national dragons. Stop wasting your time on her. Like “She Who Will Not Be Named”, it only gives her clicks.

    And, the second article links to the same pimply, purulent author-ess-like person?

    I regret the clicks I gave her already. I didn't need more.


  4. Christodoulou says, "...we have solid evidence that disadvantaged pupils do better on tests than on teacher assessments..." Oh? Then where's that research? The only research she cites is that teachers can be biased, but that doesn't necessarily equate to students doing better on tests. It's not the same thing. If teachers are biased against disadvantaged students, how are those students going to do well on tests when they're not going to be taught well? And if teachers are biased against students because they have disabilities, or are of a different ethnicity, or because of their gender, or because they're ELL, or low income, then those people shouldn't be teachers. Those are the teachers who ought to be gotten rid of because those are bad teachers who can do great damage. And she doesn't give any kind of explanation of her claim that tests can instill a love of learning.

    I read some of her other posts. I agree with her that what's talked about as "21st century skills" are skills that have always been needed and taught, but I totally disagree with her idea of how to teach grammar. She talks about "why project-based education fails"; some of the examples she cites do seem pretty ridiculous to me as far as spending way too much time on one very narrow topic, but I really liked the two for social studies. I think that with any idea in education, some people go overboard and think it's a magic bullet. I think most ideas in education have merit, but they have to all be integrated together and, most importantly, you have to know where and when and how each one makes the most sense to use and not just use them indiscriminately, which is what I see happen too often.

    She seems to consider herself an acolyte of Ed Hirsch, but I don't think she understands his ideas very well. From what I got out of Wikipedia, I like what I read about his work on text interpretation, readability, and cultural literacy, and he seems to be a big believer in the importance of using cognitive psychology, and so am I. He believes in a core knowledge that everyone should know, which sounds like it makes sense in theory, but there's so much knowledge, how do you choose what to teach? If his list is Eurocentric, that may be why he's been accused of being an elitist. The funny thing is, it's been said that his focus on core knowledge led to the idea of Common Core, but Common Core seems to be the opposite of his ideas. He emphasizes a curriculum of common knowledge (which he says you need in order to be able to make judgments and think critically and understand what you read), but Common Core targets critical thinking and has NO content, and Hirsch's cultural literacy is the exact opposite of Coleman's close reading.

  5. Dear Mr. Peter Greene:

    Second comment.

    I teach in a school that uses Hirsch. Like everything else, it depends on how you do it. It can be done well or terribly. That's all I have to say about Hirsch. Any big theory can either serve people or grind them.

    The author of these two articles is very interesting if she, in any sense, exemplifies the trend of modern thought.

    Many, many people I grew up with, and also many I now know, wish that their daily activities be governed by a “Still Small Voice”, and that their worth ultimately be judged by a generous and loving Creator.

    Other people I know would be very happy to have History judge them ultimately, but they also want the approval of their peers and community every day.

    The author wishes to be judged by an impartial machine. I sincerely hope she gets what she wants.


  6. ARK is essentially the KIPP of England. They run a chain of high-scoring "academies," which is what they call charters there. So she's Research and Development Manager of an organization with influence with the Tory government. Not quite Dave Levin's level of influence, but in the same ballpark perhaps.

  7. See also

    1. Yes, I think she has a blind spot because she doesn't realize that everybody doesn't learn the same way she does.

  8. I rarely agree with Greene but I do at least partially agree with him here that tests do have biases. I used to be a tutor for the SAT and GMAT. Many years ago, the GMAT had logic questions related to family hierarchy. For example, what is the relation between you and your grandfather's nephew? However, these were removed when it became clear that such questions unfairly penalized students who did not come from traditional families.

    However, the key part of Greene's piece is wrong. "A reliance on testing means that we make judgments about what behaviors, knowledge and skills are worth measuring. A belief in the perfect awesomeness of standardized tests leads us quickly to the conclusion that only things that can be measured by standardized tests are worth knowing or doing."

    We are NOT saying that only those things measured by tests are worth knowing or doing. And context does matter. I recently took my young daughter to the doctor. As part of a routine visit, they measured her blood pressure. This is an objective measure like our standardized tests. I was surprised that her blood pressure was much higher than I expected. I am accustomed to numbers like 130/80. But context matters. Young children normally have much higher blood pressure than adults. And certainly blood pressure was not the only thing the doctor measured. The doctor listened to her breathing and felt her stomach for resistance. These things aren't as easily quantified. But this does NOT mean that the objective measures were irrelevant. Far from it. The measures like blood pressure, temperature, etc. were essential measures that gave the doctor some idea of where my daughter measured relative to peers.

    And if she had a high temperature, that doesn't mean that the doctor was bad. But certainly it is fair to look at measures such as the length of time until the child was well again, incidence of infection, etc. and compare these to peer groups. In fact, Obamacare (which liberals like) was lauded as using such "results oriented" measures. The fact that teachers don't want to be measured in any objective manner (even in context such as looking only at subsets such as improvement in low income children) strongly indicates that they just don't want to be measured ... or that if you do measure students (e.g. NAEP), such results should not reflect on the teachers. Yes, such tests may be biased (though improved) and they may be just a part of what a teacher is trying to do. But just as with doctors they are an important part of evaluating whether that student is getting an effective education from that teacher and that school.