Sunday, May 24, 2026

Growth or Proficiency

Some of us are apparently still having this debate.

Jill Barshay wrote a piece for Hechinger Report about the DC school district, which is apparently awesome at growth but not so great at actual achievement levels. The piece does a good job of revisiting the debates about these two sorts of measures; I'd just like to add a point or three.

First, let me point out for the gazillionth time that we are not talking about student achievement and we are certainly not (as Barshay unfortunately does) talking about years of learning.  A "year of learning" or "month of learning" or "fortnight of learning" or an "afternoon of learning" is just a journalist-friendly way of packaging test results. 

We are talking about scores on a Big Standardized Test. That's it.

Barshay notes that "A school system can improve rapidly and still leave most children behind." Well, yes. Which students have more room for improvement-- those who are already at the top of their game, or those who are scoring in the basement?

Students who are bringing up the rear academically can be given more test prep, instruction that goes straight to what the test covers as well as instruction on how to take the test itself (Here's how to avoid being tricked by distractors in multiple choice questions). Students at the top of the game may well be growing and developing, but the BS Test measures such a sliver of skills (and no knowledge at all) that their growth doesn't register (You've been developing insights into quantum theory? That will not raise your test score).

This was always part of the debate over tying teacher evaluation to student scores. Focus on growth, and teachers of honors classes are in trouble, because a student who's already at the 98th percentile isn't going to grow at all. Focus on proficiency scores, and the teachers who are assigned the low-achieving students are in trouble, because no matter how well they teach those students, they will still lag (no, Virginia, there is no magical technique for "catching up" students quickly-- if there were, teachers would use it all the time).

Worse, when policy bases teacher or school evaluation on proficiency, it turns the lowest achievers into hot potatoes. We've seen this in action where charter and voucher schools work hard to avoid those low-scoring students who would mess up their numbers. When Steven Wilson is cited in the article pointing to charter schools with low-income students and high levels of proficiency, he's simply pointing to the effects of creaming, where schools do their best to avoid having their numbers damaged by low-scoring students. There is no magic trick there that can be applied "at scale" for the public system. 

Ultimately, schools cannot win playing the growth measurement game because schools cannot raise student scores every year forever, as if somehow each cohort of students were smarter than their older siblings. Test scores are not a stock market ticker.

But schools also cannot win the proficiency game. BS Test scores and "grade levels" are scaled and normed (curved). If the BS Test were truly standards-based, students taking the test could be scored instantly after they clicked the last answer. Instead, the scores have to be computed and compared and scaled, and then some state bureau sets the cut scores. And curves have to have a bottom. If, after years of intensive effort, every child tested above grade level for reading, we would not conclude that a reading education moonshot had occurred-- we would conclude that "grade level" had been set too low. If every child were rated "proficient," we would conclude that the requirements for "proficient" had been made too easy (just check every piece complaining about grade inflation).

Does test score growth tell us something? Absolutely. Does it tell us everything, or even most of the things? Absolutely not.

Do test score levels tell us something? Absolutely. Do they tell us everything, or even most of the things? Absolutely not.

The growth vs. proficiency debate is in many ways a debate about how to make the best use of a tiny, noisy slice of data. Instead, I wish we were talking about what we really should be measuring, how we can measure it, and how we are going to deal with the fact that there is much about educational quality that cannot be measured in any way that will satisfy our data overlords. Some days we are wasting way too much energy arguing about whether we should cut baloney into slices or cubes when we'd be better off figuring out how to make a healthier meal.
