Monday, January 11, 2016

Ranking Is Not Measuring

This point came up in passing a few days ago when I was reviewing some writing by Mark Garrison, but it is worth hammering home all by itself.

We have been told repeatedly that we need to take the Big Standardized Tests so that we can hold schools accountable and tell whether our teachers are succeeding or not. "Of course we need accountability systems," the policy makers say. "Don't you want to know how well we're doing?"

And then we rank schools and teachers and students. But ranking is not measuring.

Would you rather be operated on by a top-ranking surgeon or one who was the bottom of his class? What if the former is the top graduate of Bob's Backyard School of Surgical Stuff and the latter is the bottom of Harvard Medical School? Would you like homework help from the dumbest person in MENSA or the smartest person in a 6th grade remedial class? And does that prompt you to ask what we even mean by "dumb" or "smart"?

"But hey," you may reply. "If I'm going to rank people by a particular quality, I have to measure that quality, don't I?"

Of course not. You can find the tallest student in a classroom without measuring any of them. You can find the heaviest box of rocks by using a scale that doesn't ever tell you how much they weigh. Ranking requires no actual measurement at all.
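For the programmers in the audience, here is a minimal sketch of the box-of-rocks point. Everything in it is made up for illustration: the box names, the hidden weights, and the `balance` function, which stands in for a hypothetical balance scale that only ever reports which side is heavier. The sort sees nothing but those comparisons, and it still produces a complete ranking.

```python
from functools import cmp_to_key

# Hidden from the ranker: these made-up numbers only simulate physical
# reality for the hypothetical balance scale below.
_TRUE_WEIGHTS = {"box A": 12.5, "box B": 40.0, "box C": 7.25}


def balance(left, right):
    """Return 1 if left is heavier, -1 if right is heavier, 0 if they balance."""
    a, b = _TRUE_WEIGHTS[left], _TRUE_WEIGHTS[right]
    return (a > b) - (a < b)


def rank_heaviest_first(boxes):
    # sorted() only ever sees the outcome of each comparison,
    # never a number of pounds or kilograms.
    return sorted(boxes, key=cmp_to_key(balance), reverse=True)


ranking = rank_heaviest_first(["box A", "box B", "box C"])
print(ranking)
```

The result tells you which box is heaviest and which is lightest, and nothing about whether the heaviest box is a pebble collection or a boulder.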

Not only that, but when we are forced to measure, ranking encourages us to do it badly. Many qualities or characteristics would best be described or measured with a many-dimensional matrix with a dozen different axes. But to rank, we have to reduce that complex multidimensional measurement to something that can be measured with a single-edged stick.

Who is most attractive-- Jon Hamm, Ryan Gosling, or George Clooney? It's an impossible question because it involves so many factors, from hair style to age to wry wit vs. full-on silliness, all piled on top of, "Attractive to whom, exactly?" We can separate out all of those factors and measure each one independently, and that might create some sort of qualitative measure of attractiveness, but it would be so complicated that we'd have to chart it on some sort of multi-matrix omni-dimensional graphy thing, and THAT would make it impossible to rank the three gentlemen. No, in order to rank them we would either have to settle on some single measurement that we use as a proxy for all the rest, or some bastard offspring created by mashing all the measures together. This results in a ranking that doesn't reflect any kind of real measurement of anything, and that is ultimately both meaningless and unconvincing (the ladies of the George Clooney fan club will not change allegiance because some data-driven list contradicts what they already know in their hearts).

In fact, when we create the bastardized mashup measurement, we're really creating a completely new quality. We can call it the Handsomeness Quotient, but we might as well call it Shmerglishness.
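A rough sketch of how arbitrary the mashup is, for anyone who wants to see it in code. The candidates, the three factors, the scores, and both sets of weights are all invented for illustration; nothing here is real data. The same three people get ranked in exactly opposite order depending on which weights the list-maker happens to pick.

```python
# Invented scores on three invented factors (0-10 scale, assumed).
candidates = {
    "A": {"hair": 9, "wit": 4, "silliness": 5},
    "B": {"hair": 5, "wit": 9, "silliness": 4},
    "C": {"hair": 4, "wit": 5, "silliness": 9},
}


def shmerglishness(scores, weights):
    """Mash several measures into one number with a weighted sum."""
    return sum(weights[factor] * value for factor, value in scores.items())


def rank(weights):
    """Order candidate names from highest to lowest composite score."""
    return sorted(
        candidates,
        key=lambda name: shmerglishness(candidates[name], weights),
        reverse=True,
    )


# Two equally defensible (and equally arbitrary) weightings.
hair_first = {"hair": 0.6, "wit": 0.3, "silliness": 0.1}
silly_first = {"hair": 0.1, "wit": 0.3, "silliness": 0.6}

print(rank(hair_first))
print(rank(silly_first))
```

Nothing about the three candidates changed between the two lists. Only the definition of the composite did.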

So let's go back to "smart," a word that is both as universally used and as thoroughly vague as "good" or "stuff." Smartitude is a complex of factors, some of which exist not as qualities but as relationships between the smart-holder and the immediate environment (I'm pretty smart in a library, average under a car hood, and stupid on a basketball court). Measuring smart is complicated and difficult and multi-dimensional.

But then in the ed biz we're going to fold that quality into a broader domain that we'll call "student achievement," and now we are talking about describing the constellation of skills and abilities and aptitudes and knowledge for an individual human being. To rank, we are required to use a single-axis shmerglishness number.

We could go on and on about the many examples of how complex systems cannot be reduced to simple measures, but I want to go back and underline that main idea--

Ranking is not measuring. In fact, ranking often works directly against measuring. As long as our accountability systems focus on ranking students, teachers, and schools, they will not tell us how well the education system is actually working.


  1. Brilliant! Such clear analogies! Even though it can't really be measured, I feel like my smartness quotient has been raised from reading this. :)

  2. I feel the same way about the Top 100 blogs. The answer is always, without fail, "compared to what?"

    Also: George Clooney.

  3. In most psychometrics courses, ranking is actually considered one form of measurement (ordinal scale) based on S.S. Stevens's classic quadra-chotomy of nominal, ordinal, interval, and ratio scales.

    We could argue whether this makes it "scientific" measurement (I tend to agree with Joel Michell and Paul Kline and others who claim only ratio scale measurement is truly scientific), but it is generally accepted by psychometricians and statisticians who perform (for example) ordinal regression analyses, etc.

    If ordinal scale variables are not "really scientific" (like most science), then neither are interval scale variables, like any score on any scaled standardized test of mysterious latent mental traits. But I probably don't need to try to convince you of that one.