Tuesday, June 23, 2015

McGrading the McTest

In Monday's New York Times, journalist Motoko Rich gives a master class in how to let the subjects of a story make themselves look ridiculous.

The piece takes us to San Antonio for a look at how the Big Standardized Tests are actually graded, and in doing so shows that the BS Tests are not one whit more advanced than the old-school bubble tests they claim to replace.

Rich begins by noting that the scoring is not necessarily (or even probably) done by actual teachers. But Pearson's VP of content and scoring management is here to reassure us with this astonishing quote:

“From the standpoint of comparing us to a Starbucks or McDonald’s, where you go into those places you know exactly what you’re going to get,” said Bob Sanders, vice president of content and scoring management at Pearson North America, when asked whether such an analogy was apt.

“McDonald’s has a process in place to make sure they put two patties on that Big Mac,” he continued. “We do that exact same thing. We have processes to oversee our processes, and to make sure they are being followed.”

This is not news, really. For years we've been reading exposés by former graders and Pearson's advertisements on Craigslist. It can be no surprise that the same country that has worked hard to teacher-proof classrooms would also find a test-scoring method suitable for folks with no educational expertise.

How low does the bar go? Consider this quote from one scorer, a former wedding planner who immigrated from France just five years ago:

She acknowledged that scoring was challenging. “Only after all these weeks being here,” Ms. Gomm said, “I am finally getting it.”

Sigh. I cut and pasted that. It is not one of my innumerable typos.

Look, here's the real problem revealed by this article (and others like it).

The test manufacturers have repeatedly argued that these new generation tests are better because they don't use bubble tests. They incorporate open-ended essay(ish) questions, so they can test deeper levels of understanding-- that's the argument. A multiple choice question (whether bubbling, clicking, or drag-and-dropping) only has one correct answer, and that narrow questioning strategy can only measure a narrow set of skills or understanding.

So essays ought to be better. Unless you score them like this, according to a narrow set of criteria to be used by people with no expertise in the area being tested. If someone who doesn't know the field is sitting there with a rubric that narrowly defines success, all you've got is a slightly more complicated bubble test. Instead of having four choices, the student has an infinite number of choices, but there's still just one way to be right.

Nobody has yet come up with a computerized system of grading writing that doesn't suck and that can't be publicly embarrassed. But if you're going to hire humans to act like a computer ("Just follow these instructions carefully and precisely"), your suckage levels will stay the same.

If it doesn't take a person with subject knowledge to score the essay, it doesn't take a person with subject knowledge to write it.

So the take-away from Rich's piece is not just that these tests are being graded by people who don't necessarily know what the hell they're doing, but that test manufacturers have created tests for which graders who don't know what the hell they're doing seem like a viable option. And that is just one more sign that the Big Standardized Tests are pointless slices of expensive baloney. You can't make a test like McDonald's and still pretend that you're cooking classic cuisine.


  1. "... about 100 temporary employees of the testing giant Pearson worked in diligent silence scoring thousands of short essays written by third- and fifth-grade students from across the country.

    "There was a onetime wedding planner, a retired medical technologist and a former Pearson saleswoman with a master’s degree in marital counseling. To get the job, like other scorers nationwide, they needed a four-year college degree with relevant coursework, but no teaching experience. They earned $12 to $14 an hour, with the possibility of small bonuses if they hit daily quality and volume targets."

    Put yourself in the shoes of one of these non-teacher temps who make $12-14/hour:

    --- the faster you grade tests---i.e. the more half-assed and the more rushed your grading is---the greater the volume of tests that will get graded by you... and the more money you'll make, and the greater likelihood you'll keep your job...

    ... while conversely...

    --- the more careful and caring a job you do in grading essays, the less money you'll make, and the greater likelihood you'll get canned.

    This whole market-based factory model basically incentivizes doing an increasingly horrible job grading these essays, and disincentivizes doing a thorough and fair job of grading.

    Those who actually take time to do the job right are penalized and presumably fired for not reaching their "quota" of tests graded.

    And these wildly inaccurate "results" will then be used to judge the "quality" of both teachers and schools, with those of poor quality fired (teachers) or closed (schools).

    These fired teachers will probably end up getting replaced by those bonus-earning essay graders who can churn out the greatest number of graded essays, and the traditional public schools that are closed will end up getting re-opened under the management of Acme Charter Schools Inc., a privately run for-profit corporation.

  2. It's somewhere between amusing and pathetic that they're actually trying to defend themselves by comparing themselves to McDonald's.

  3. Yes. Yes. Yes. This is why I don't use rubrics when grading student essay exams and papers in my college philosophy classes. There are just too many ways to write a good response and there is no set of necessary and sufficient material that needs to be in each one of those possible good responses.

    Instead of a rubric I use my vast experience in the field. I read every word that they write and I grade the entire essay or answer as a whole while still managing to look at all the parts. It is hard work, and it takes a lot of time, especially when I'm giving comments in the early part of the semester. But I don't know any other effective method of teaching my students to think and write in a serious way about philosophy.

  4. Everybody should read Todd Farley's book, Making the Grades: My Misadventures in the Standardized Testing Industry -- it's a fascinating, albeit depressing, read.

  5. What stood out to me is the vice president of content and scoring comparing grading an essay to putting two patties on a Big Mac, and having "processes to oversee our processes." He sounds like he's trying to do a parody.