Teacher Eval: Waist Deep in the Big Muddy

Thomas Toch turned up in the Atlantic this morning to argue that teacher evaluation, now given a bit of freedom in the new ESSA, should stay the course.

Toch is senior partner at the Carnegie Foundation for the Advancement of Teaching, a name apparently chosen for its high degree of irony. Their emphasis is on making teachers into uniform cogs in a machine that works at scale. One of their six guiding principles is "Variation in performance is the core problem to address." Their staff includes a woman with one of the absolutely best titles ever-- Director of Productive Persistence-- but their board of trustees includes many of the usual reformy suspects, including Harvard Graduate School of Education, Teach for America, and Randi Weingarten.

Toch notes that the Obama administration worked real hard to push teacher evaluation systems, even though they were opposed by the "two powerful forces" of teacher unions and the Tea Party. But he is concerned that ESSA "abandoned" the work of identifying "who in the profession was doing a good job, and who wasn't."

This is a bit of a fuzzy point. There's actually a difference between trying to identify effective teachers and trying to foster teaching effectively, but Toch is going to cut several corners before we're done.

The teacher unions have dismissed the Obama strategy as ineffective, as more hurtful than helpful to the teaching profession. But over three dozen states have embraced more meaningful teacher-measurement systems under the Obama incentives, combining features like clearer performance standards, multiple classroom observations, student-achievement results and, increasingly, student surveys. 

First of all, a bicycle, because a vest has no sleeves. Toch has put two sentences side by side that have nothing to do with each other. Have teacher unions dismissed Obama's "strategy" as ineffective and more hurtful than helpful? Well, yes-- and so have a boatload of other people. So it might make sense to ask if the system is, in fact, any good. But instead Toch says, "But hey-- lots of people implemented systems of some kind."

What Toch persistently and deliberately skates around throughout the article is that the Obama plan for teacher evaluation rested almost entirely on linking teacher evaluation to student test scores through what's usually called a VAM system, and it has been debunked and rejected by everyone from the American Statistical Association to the National Association of Secondary School Principals. There is an entire blog (Vamboozled), run by a numbers scholar, devoted to debunking VAM.

Toch very carefully avoids mentioning that Obama's teacher evaluation plan was to tie teacher evaluation to the same high-stakes standardized tests that have become so controversial, in a system widely regarded as simply not working. The test score evaluation ties come up just twice-- and Toch dismisses them as if they are something far in the past and not part of current reality, and blames them on Duncan. And to prove that he's uninterested in facts and data and reputable science, he cites the National Council on Teacher Quality, an organization that has rated colleges on programs that don't exist and once critiqued college education programs based on the handouts from commencement. They are quite possibly the least serious research group in all of education, and if Toch wants to make a serious point, he should not mention them.

He refers to some other great new ideas, like teaming up master teachers with newbies, which is neither a bad idea nor a new one. He touts new systems for providing teachers with personalized "playlists" of canned lessons, as if that's a good idea (it's not). He notes that lots of professional development sucks, which is news to exactly nobody. He notes that some side-effects have been stupid (gym teacher evaluated on ELA test scores), but he signals that he really doesn't get it with an oft-repeated refrain:

But it’s clear from the many new evaluation initiatives launched in recent years that well-designed evaluation systems with a mix of measures, multiple evaluators, and a strong focus on teacher improvement can strengthen instruction, make teaching more attractive work, and raise student achievement.

This is the signal fallacy, the giant gaping maw of wrong nestled in the heart of Bush-Obama teacher eval policies-- the notion that a teacher's primary job is to get students to score well on a Big Standardized Test. "Student achievement" is reformspeak for "test scores," and that's simply not the most important-- probably not even An important-- part of a teacher's job. No parent in America says, "My kid has a great teacher this year," and means "My kid's teacher helped her get some really good test scores."

The Obama-era teacher evaluation systems sucked. They collected lousy information about things that aren't even the most important part of a teacher's work. They consistently proved to be unreliable and invalid. They provided no useful information to anybody. One of the few bright spots of ESSA is the end of the federally-mandated inaccurate unreliable nonsense evaluation system. Yes, many of the old-style evaluation systems were not very helpful, but the new systems actually managed to be worse by creating the illusion that real evaluating was going on, and by forcing schools to focus on unimportant baloney instead of real teaching. Toch can go wading on into the Big Muddy, but I recommend that the rest of us turn around and get back on solid ground.


  1. Let me save Toch a whole bunch of time, energy, and money.

    Every school, college, or university in America, public, private, or pirate (charter) contains roughly the same mix of teacher competencies.

    A small handful of truly excellent teachers, a very tiny handful of truly incompetent (read: "harmful") teachers, and the remainder (the majority), who are satisfactory, ranging from good to mediocre. Sound familiar? Sounds like your workplace too?

    Now what about that small handful of incompetents doing more harm than good? To begin with, every principal knows who they are. And unless they inherited them, building principals interviewed them, hired them, observed them, evaluated them, and granted them due process protections (tenure). Maybe Mr. Toch would be better off trying to improve administrative management skills rather than wasting all his efforts trying to cure the symptom.

    1. I said the same thing on Dr. Ravitch's blog, that there are great teachers and terrible teachers, with most of us in the middle, but found myself roundly criticized for making unwarranted generalizations. Good to see a closer connection to the actual world here.

      I think you might be a little hard on principals here though. In California, for example, tenure is granted in under two years, giving little opportunity to gather information about how the teacher grows with experience. Dismissing teachers after tenure is a very expensive and lengthy process, likely to result in the teacher not being suspended. I have to think that there are far more incompetent teachers than there are teachers who are sexual predators, yet the most common reason for a tenured teacher to be fired appears to involve sexual misconduct.

    2. It is possible but quite rare for a teacher in California to receive tenure in less than 2 years. It took me 5 years because my first 3 years were on temporary contracts and did not count toward tenure. I have never met a teacher who got tenure in less than 2 years.

      Good administrators can get rid of incompetent teachers. I have witnessed it. However, the students make life so difficult for incompetent teachers that most of them quit before approaching tenure.

    3. For California, I can only go by what I read. It may be that these days in practice people take more than 2 years.

      Good administrators perhaps can get rid of incompetent teachers, but generally they don't get rid of incompetent teachers. Dismissal hearings appear to be rarely about teaching incompetence.

      As for students making life miserable for teachers, that is likely true if the special area of incompetence is classroom management. Other areas of incompetence might actually be welcomed by some students.

  2. VAM sucks, and so do checklist drive-by observations, such as Danielson. In business, rating employees is called stack ranking. Microsoft abandoned it because it is a morale killer and therefore a productivity killer. But what's good for the billionaire goose is apparently not good for the gander. Bill Gates promoted similar systems for schools.