Thursday, April 13, 2017

New Merit Pay Study Hits The Wrong Target

We're all going to be hearing about a piece of research, a working paper that suggests that teacher merit pay works. Sort of. Depending on what you mean by "works."

Matthew G. Springer, an assistant professor of public policy and education at Vanderbilt University, has produced a meta-analysis (that's research of the research) entitled "Teacher Merit Pay and Student Test Scores: A Meta-Analysis" in which he concludes that merit pay is connected to increased student test scores. Springer is also the director of the National Center on Performance Incentives, "a national research and development center for state and local policy" housed by Vanderbilt (he's actually had that job longer than his professor position).

During the past several decades, policymakers have grown increasingly interested in innovative compensation plans, including performance-based pay for K-12 educators. Yet, efforts to reform pay have lacked grounding in a scholarly base of knowledge regarding the effectiveness of such plans.

So I'm not sure the center's mission is "see if this stuff works" so much as it is "prove this stuff works," which is a somewhat less objective mission. And Springer does some work outside of Vanderbilt as well, like his post on the advisory board of Texas Aspires, where he sits with Rick Hess (AEI), Mike Petrilli (Fordham), Eric Hanushek (Hoover Institution), Chris Barbic (Reformster-at-Large, now apparently with Arnold Foundation) and other reformy types.

Springer certainly has some ideas about teacher pay:

"The bottom line is the single-salary pay schedule does not allow systems to reward the highest performing teachers," Springer said. "These teachers deserve a six-figure salary, but we'll never get there with a single-salary schedule that would require all teachers of equal experience and degree attainment to get paid the same amount. It's just impossible."

The EdWeek quote would suggest that Springer and I do not agree on what a "high-performing teacher" looks like. Here's the quote from EdWeek that suggests to me that Springer doesn't entirely understand what he's studying:

The findings suggest that merit pay is having a pretty significant impact on student learning.

Only if you believe that Big Standardized Tests actually measure student learning-- a finding that remains unfound, an assumption that remains unproven, and an assertion that remains unsupported. My faith in their understanding of the real nature of BS Tests is further damaged by their reference to "weeks of learning." Researchers' fondness for describing learning in units of years, weeks, or days is a great example of how far removed this stuff is from the actual experience of actual live humans in actual classrooms, where learning is not a featureless tofu-like slab from which we slice an equal, qualitatively-identical serving every day. In short, measuring "learning" in days, weeks, or months is absurd. As absurd as applying the same measure to researchers and claiming, for instance, that I can see that Springer's paper represents three more weeks of research than less-accomplished research papers.

Springer et al. note some things they don't know in the "for further study" part of the paper.

EdWeek missed one of the big implications in the conclusion:

Teacher recruitment and retention, however, is another theoretically supported pathway through which merit pay can affect student test scores. Our qualitative review of the emerging literature on this pathway suggests that the positive effect reported in our primary studies may partly be the result of lower levels of teacher turnover. 

In other words, burning and churning doesn't help with your test scores. You know what doesn't encourage teachers to stay? Tying their pay (and job security) to the results of bad tests whose results are more clearly tied to student background than teacher efforts. You know what does encourage teachers to stay? The knowledge that they are looking at a pay structure that at least helps them keep pace with the increases in cost of living, and not a pay structure that will swing about wildly from year to year depending on which students they end up teaching.

Springer also acknowledges a caveat parenthetically which really deserves to be in the headline:

our evidence supports the notion that opportunities to earn pay incentives can lead to improved test scores, perhaps through some increased teacher effort (or, nefariously, gaming of the performance measure system).

Yes, that nefarious gaming of the system, which in fact remains the best and often only truly effective method of raising BS Test scores. This is a huge caveat, a giant caveat, the equivalent of saying "Our research has proven that this really works-- or that if you offer people money, some will cheat in order to get it." This research might prove something kind of interesting, or it might prove absolutely nothing at all. That deserves more than a parenthetical comment or two.

Springer's research suffers from the same giant, gaping ridiculous hole as the research that he meta-analyzed-- he assumes that his central measure measures what it claims to measure. This is like a meta-analysis of a bunch of research from eight-year-olds who all used home-made rulers to measure their own feet and "found" that their feet are twice as big as the feet of eight-year-olds in other countries. If you don't ever check their home-made rulers for accuracy, you are wasting everyone's time.

At a minimum, this study shows that the toxic testing that is already narrowing and damaging education in this country can be given an extra jolt of destructive power when backed with money. The best this study can hope to say is that incentives encourage teachers to aim more carefully for the wrong target. As one of the EdWeek commenters put it, "Why on earth would you want to reward teachers with cash for getting higher test scores?" What Springer may have proven is not that merit pay works, but that Campbell's Law does.

[Update: Be sure to read the comments for Jersey Jazzman's explanation of just how little the numbers in this study tell us.]

1 comment:

    1) The effect size for US schools is 0.035 standard deviations. This is equivalent to moving from the 50th percentile to 51.4.

    2) Not only is converting to "days of learning" absurd - it is theoretically invalid. It's a big discussion and I'll blog on it when I can, but for now:

    The study the authors use to justify the translation -- Hill, Bloom, et al. (2008) -- specifically points out the "gain" from grade to grade is much larger in the earlier grades than in the later ones when expressed as standard deviations. Moving from K to Grade 1 in reading is 1.52 SD; from Grade 7 to 8, however, is 0.26 SD.

    This means that "3 weeks of learning" has a completely different meaning when in K than in Grade 7. So you can't just average this stuff and then plop it into your meta-analysis' conclusions.

    Worse, the tests in Hill et al. are vertically scaled, a minimal requirement when attempting to describe "gains" on a time scale. There is no indication this is true in the merit pay meta-study; I can nearly guarantee the tests used weren't all vertically scaled.

    The point is the effect of merit pay found here isn't "moderate" -- it's very small.


    Mark Weber (Jersey Jazzman)
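
For anyone who wants to check the arithmetic in the comment above, here is a rough Python sketch. It assumes test scores are normally distributed (which is what the percentile conversion implies) and, for the time conversion, a 36-week school year; that 36 is my own placeholder, not a figure from the study or from Hill, Bloom, et al. The function names are made up for illustration. The effect size and the two grade-to-grade gains are the numbers quoted in the comment.

```python
from statistics import NormalDist

# Numbers quoted in the comment above.
EFFECT_SIZE_SD = 0.035
ANNUAL_GAIN_SD = {"K to Grade 1": 1.52, "Grade 7 to 8": 0.26}

# Assumption for illustration only: a 36-week school year.
WEEKS_PER_SCHOOL_YEAR = 36


def percentile_after_shift(effect_size_sd, start_percentile=50.0):
    """Percentile reached after moving up by effect_size_sd,
    assuming normally distributed scores."""
    std_normal = NormalDist()
    z_start = std_normal.inv_cdf(start_percentile / 100)
    return 100 * std_normal.cdf(z_start + effect_size_sd)


def weeks_of_learning(effect_size_sd, annual_gain_sd,
                      weeks_per_year=WEEKS_PER_SCHOOL_YEAR):
    """Translate an effect size into 'weeks', given how many SDs
    a student is said to gain over one school year."""
    return effect_size_sd / annual_gain_sd * weeks_per_year


print(f"50th percentile moves to {percentile_after_shift(EFFECT_SIZE_SD):.1f}")
for span, gain in ANNUAL_GAIN_SD.items():
    weeks = weeks_of_learning(EFFECT_SIZE_SD, gain)
    print(f"{span}: {weeks:.1f} weeks")
```

Under those assumptions the same 0.035 SD works out to less than one week measured against the K to Grade 1 gain and nearly five weeks measured against the Grade 7 to 8 gain, which is exactly why a single "weeks of learning" figure tells you so little.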