Sunday, October 12, 2014

What We Haven't Learned from the Widget Effect

Do you remember that awesome post I wrote that totally changed the face of American education?? You don't?? Well, let me just keep mentioning that awesome post (and how it changed the face of American education) for the next five years and maybe my massive importance will start to sink in.

That's about where we are with TNTP and "The Widget Effect," a "report" I'm not going to link to for the same reason I don't mention TNTP's leader by name or provide links to pro-anorexia sites-- some things are just already taking up too much of the internet.

The Widget Effect is celebrating the fifth anniversary of its own importance. If you're unfamiliar with the "report," let me summarize it for you:

We don't pay teachers differently based on how good they are. We should do that.

That's it. Pump it up with extra verbiage and slap on some high-falutin' graphics, and you've got a "report" that other "report" writers love when they need to add some gravitas to the footnote section of their "report." As you may have heard, there's particular interest in the "We should do that" portion; TNTP is a huge fan of teacher evaluating.

TNTP has presented several anniversary evaluation commentary-paloozas, including this one that sandwiches a thoughtful Andy Smarick piece in between two large slabs of reformy baloney. But that's not where we're headed today. Today we're going to look at "4 Things We've Learned Since the Widget Effect." Let's do a little check for understanding and see if our five years of study have paid off.

Implementation Matters More Than Design

Correct! Reformsters have learned (and are still learning) that if you promise people a warm, cuddly pet and then drop an angry badger into their home, they lose interest in your promises very quickly. Further, you do not provide useful damage control by repeating, "But it's really intended to be warm and cuddly" while the badger has the children cornered and terrified on top of the credenza. Teacher evaluation has had teachers on top of the credenza for about five years, so happy anniversary, honey badger!

TNTP offers a solution best summarized as "Do it better." Sigh. In more words, the recommendation is that if you train your key people and give them time to do a better job, the badgers will be warmer and cuddlier. TNTP describes these key people with words like "Chief Academic Officers" and "middle managers." The odd terminology leads us back to a central question-- does TNTP think the badgers are warm and cuddly, or does it just want to convince us so we'll let the badgers trash the house? I won't rule out the former, but I lean toward the latter.

Multiple Measures-- Including Data about Student Learning Growth-- Are the Way To Go

The old observation technique was a bust, TNTP says. They support this by saying that it's just common sense. So there ya go.

While the issue of evaluation remains hotly debated, multiple measures might be the one place where something resembling a consensus has emerged. That’s a positive thing we should celebrate.

Really? Which consensus would that be? There's a fairly large consensus that "including data about student learning growth" (aka VAM) is problematic because every instrument we have that claims to do it is no more reliable than having the badgers read tea leaves through a crystal ball. I'm guessing that's not the consensus being referenced.

So incorrect on the main answer. Their recommendation, however, is to have multiple observations by multiple observers. In buildings with enough administrative staff to implement it, that idea is... not stupid.

You Can't Fix Observations If Observers Don't Rate Accurately

Observations are also one of the best examples of the gap between design and implementation. If you’re concerned about the potential variability of value-added scores, you should be truly frightened by the statistical Wild West that is classroom observations. 

They're onto something here. Here's the thing about administrators-- if they are even remotely competent, they know how good their teachers are. They'll use the fancy piece of paper if you make them, but if the observation instrument tells them one thing and their brain, sense, and professional judgment tell them another, guess who wins. If you ask, "What are you going to believe-- the observation form or your own eyes?" they will go with their own senses.

Now, if your principal is a boob, or hates you for some reason, this effect is Very Bad News. Maybe you call that the statistical Wild West, but that's still better than VAM, which is a statistical black hole caught in a box with Schrödinger's cat strapped into the hold of the Andrea Doria sailing through the Sargasso Sea as it falls into the Negative Zone just as the Genesis Bomb goes off.

TNTP's solution-- easier, shorter paperwork. Because reducing a complicated human observation of complex human interactions to a short, simple checklist totally works. I suggest that TNTP staffers field test the principle by piloting a spousal observation form on their own wives and husbands.

Double fail on this item.

Done Right, Teacher Evaluations Really Can Help Teachers and Students

We're going to go to the research connected to the IMPACT evaluation system in DC. And damn-- these people can't really be that confused or dopey, can they? I want to believe that they are willfully manipulative and misleading, because that would at least mean they're smart enough to understand what they're saying, and as a teacher, it makes me sad to imagine a lump of dumb this large in the world.

Okay, here's the deal. They measure a teacher's awesomeness. They give the teacher feedback on the measurement. They measure again, and the teacher proves to be more awesome. Let me see if I can illustrate why this proves almost nothing.

Chris: If you pass my test for being my awesomest friend, I will give you a dollar. Now, hold up some fingers.

Pat: Okay. How'd I do?

Chris: Bummer. If you had held up four fingers instead of three, I would have known you were my awesomest friend.

[Fifteen minutes later]

Chris: Okay, let's take the awesome friend test again. Hold up some fingers.

Pat: Cool.

Chris: You did it. Four fingers!! Here's a dollar!

[Later over supper at Chris's house]

Chris: Mom, Pat and I became much better friends this afternoon!

The IMPACT system and the attendant research are not useless. They prove that teachers can be trained to respond to certain stimuli as easily as lab rats. They do not, however, prove jack or squat about how the system "improves" teaching-- only that it improves teacher response to the system.

TNTP recommends staying the course. I recommend that TNTP release a dozen honey badgers into their offices and hold some special training meetings on top of the credenza. If the credenza is all covered up with the Widget Effect's birthday cake, just feed the cake to the badgers. Tell them they're celebrating one of the most influential reports of the last five years.


  1. To me, "Implementation matters more than design" is an automatic fail, because it's obviously wrong: if you don't have a good design to begin with, you've got nothing. IMPACT's Teaching and Learning Framework is not bad, but they only assess some of it, and even that can't be fully assessed in a random 30-minute observation, even by a "master teacher" (who decides that person is a "master teacher"?) who may or may not be in your subject area.

  2. Ask any teacher if he or she can identify the good teachers in the building and you'll get something between "yes, absolutely" and "damn right I can." But then, when it comes to teacher evaluation, I hear the constant refrain of how impossible it is to evaluate teachers correctly. Which is it?