Saturday, August 15, 2015

VAM on Trial in NY

If you don't know Sheri Lederman's name, you should. She's the New York teacher who, with her lawyer husband, dragged VAM into a courtroom this week and gave it the beatdown it so richly deserved.

Lederman's story is, at this point, the story of millions of other teachers. One year, her VAM score indicated that she was awesome. The next year, her VAM score indicated that she sucked. Not only was she pretty much the same teacher, but her students got pretty much the same scores.

Because of the importance of the case, lots of folks were there to watch. Carol Burris has a great account in The Answer Sheet, and this blog by Alexandra Miletta, who has known Lederman for decades, is also worth a read. Audrey Amrein-Beardsley has been following the case for a while. Diane Ravitch provides links to the pertinent documents and expert affidavits in the case.

There appear to be two issues that strike the judge in the case as dopey.

The Curve.

How do you set up an evaluation system that predetermines that some teachers must be bad? Judge Roger McDonough wants to know how you can have a fair system that starts with the premise that even if all the teachers are effective, some of the teachers are not effective. How can evaluations be evaluations if they are not actually tied to a real standard?

The Avatars

New York, like most VAM systems, bases its evaluations on imaginary students. The magical formula creates an imaginary student, an avatar, who is somehow located in an imaginary universe where a neutral teacher leads her to a particular score. If your real student does better on the test than her imaginary counterpart, congratulations-- you're a swell teacher. If your real student does only as well as, or worse than, the imaginary counterpart-- so sorry, but you suck.

This is math as magic, an attempt to do a thing which cannot be done, and to convince yourself you've done it because, hey, numbers!!
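The avatar scheme described above can be sketched in a few lines of code. This is a toy illustration, not New York's actual growth model: the function names, the simple one-variable prediction, and all the numbers are invented. It predicts each student's "expected" score from last year's score, credits or blames the teacher for the gap, and then, per the forced curve, labels the bottom slice of teachers "ineffective" no matter how everyone's students actually did.

```python
# Toy value-added model. Each student's "expected" score is predicted
# from last year's score; a teacher's VAM score is the average gap
# between actual and expected. All names and numbers are invented.

def fit_line(xs, ys):
    """Ordinary least squares fit: return (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def vam_scores(teachers):
    """teachers: {name: [(prior_score, current_score), ...]}.
    Returns each teacher's average actual-minus-expected gap."""
    pairs = [p for roster in teachers.values() for p in roster]
    slope, intercept = fit_line([p[0] for p in pairs],
                                [p[1] for p in pairs])
    return {
        name: sum(cur - (slope * prior + intercept)
                  for prior, cur in roster) / len(roster)
        for name, roster in teachers.items()
    }

def rate(scores, ineffective_share=0.07):
    """Force the bottom share of teachers to be 'ineffective',
    no matter how well their students actually did."""
    ranked = sorted(scores, key=scores.get)  # worst gap first
    cutoff = max(1, round(len(ranked) * ineffective_share))
    return {name: ("ineffective" if i < cutoff else "effective")
            for i, name in enumerate(ranked)}

# Every teacher's students improve, yet someone must land at the bottom.
teachers = {
    "A": [(80, 85), (82, 88)],
    "B": [(81, 86), (83, 89)],
    "C": [(80, 84), (82, 87)],
}
ratings = rate(vam_scores(teachers))
```

Run it and teacher C comes back "ineffective" even though C's students gained four or five points apiece, because `rate` guarantees failures by construction. That is exactly the point Judge McDonough kept pressing.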

It will be a month or two before the judge comes back with a ruling, and if he rules against the evaluation system, get ready for the gates of hell to open. In the meantime, the Ledermans stand as a reminder that sometimes, someone has to stand up and make a fuss, and sometimes, when you look around at the circumstances of the moment, that person turns out to be you.


  1. Peter,

Here's another great account of the Lederman trial from Carol Burris, with an introductory summary for the unacquainted by Valerie Strauss of the Washington Post:


    The exasperated New York Supreme Court judge, Roger McDonough, tried to get Assistant Attorney General Galligan to answer his questions. He was looking for clarity and instead got circuitous responses about bell curves, “outliers” and adjustments. Fourth-grade teacher Sheri Lederman’s VAM score of “ineffective” was on trial.

    The more Ms. Galligan tried to defend the bell curve of growth scores as science, the more the judge pushed back with common sense. It was clear that he did his homework. He understood that the New York State Education Department’s VAM system artificially set the percentage of “ineffective” teachers at 7 percent. That arbitrary decision clearly troubled him. “Doesn’t the bell curve make it subjective? There has to be failures,” he asked.

    The defender of the curve said that she did not like the “failure” word.

    The judge quipped, “Ineffectives, how about that?” Those in attendance laughed.

    Ms. Galligan preferred the term “outlier.” Those who got ineffective growth scores were “the outliers who are not doing a good job,” the attorney said. She seemed oblivious to the fourth-grade teacher who was sitting not 10 feet away from where she stood.

    “Did her students learn nothing?” Justice McDonough asked. “How could it be that she went from 14 out of 20 points to 1 out of 20 points in one year?” He noted that the students’ scores were quite good and not that different from the year before.

    Back behind the bell curve Ms. Galligan ran. As she tried to explain once again, the judge said, “Therein lies the imprecise nature of this measure.”

    I met Sheri Lederman a year before she became an “outlier.” In April of 2013...

    (and on it goes ... )

  2. This is priceless. Hats off to Sheri, and I bow down very, very low. I SO want this ridiculous, "magickal" algorithm system to get a smack-down and get booted out, never to be heard of again.

  3. I am not sure why the idea of an "imaginary student" would be so strange to teachers. When teachers assign students a grade, are they comparing that student to a particular other student in the class or some standardized notion of a student? If your real student does work that you think an excellent student would do, don't you give the student an A? If your real student does not reach a standard that you think an excellent student would reach, don't you give that student a B or C or something lower?

    1. I think in this case they're talking about an "imaginary" student who has no social, economic, emotional, physical, or family problems. And probably an imaginary teacher, too.

    2. Rebecca,

      I am talking about how teachers assign grades. When I consider giving a student an A for a class, I consider what constitutes excellent work in the classroom, constructing a standard that could easily be thought of as an "imaginary" student.

      Can I ask how you determined grade breaks in your class?

    3. In my foreign language classes, grades were made up of class participation, written work, and test scores.

    4. Rebecca,

      How did you decide what level of class participation, written work, and test scores qualified a student for a particular grade? When I am looking at student work I do not compare that work to any particular other student, but an "imaginary" student who represents a standard for A work, for B work, and occasionally for F work.

    5. I work off of a point system. As much as I don't like numbers, they have their uses. The weight of each of the three components depends on the level, French I, French II, etc., though it happens somewhat naturally depending on how things are going that year, how much written work I assign, how many days there's opportunity for them to volunteer, which is most days, how many tests I give.

Pretty much every day they get participation points for doing everything I ask them to, but they can't get an A for the day in participation unless they volunteer. They get a point for raising their hand, even if I don't call on them, because I can't call on everyone every day, though I try, and they don't have to have the right answer; it can be that they ask a question, or volunteer to do something on the board. If they volunteer more than once, they can get extra points up to a certain number, and that's the only way they can get extra credit in my class. In foreign language it's important that they speak and use the language, so this encourages that, though they can ask questions in English also, and this keeps them engaged in class. Participation is weighted more in first year, because I start out with speaking and listening.

Worksheet written practice I check in class, and they get points for doing it and for approximately how well, because I can tell at a glance if they have the right idea. Then students put the answers on the board, we go over them, and they correct their own papers. Sometimes I collect the papers to give them points for doing the corrections; that keeps them engaged, and then I know they have the right answers to study for the test, because the tests are normally similar in content and format to the written work. When they do original writing, my rubric is: would a native speaker understand what they're trying to say, and does the student show improvement in the use of vocabulary and structure?

      Their quarter grade is the number of points they earned divided by the number of points possible. A (or A-) 90-100%, etc. But it's important the grades be fair, and if they don't look quite right to me, I figure there's something wrong with my formula, and I play a little with the weighting until it looks like everybody's grade matches what they've earned by their effort and mastery. If students come to class, pay attention, do their best effort to do everything I ask, and follow my guidelines on how to study for tests, most students should get an A or a B, and it will match with their content and skill mastery.

      Tests are weighted most, but participation is their oral grade. I'm sure post-secondary and different subjects are different, but at the K-12 level we're also teaching study and learning habits, so effort counts too. I make up most of my own material and activities, using things from different materials I have when they mesh with the skills and content I'm teaching.

    6. Rebecca,

The point totals that add up to an A or B or C or any grade are in essence your "imaginary" student that you compare your actual students to in determining their grade. If they don't look quite right to you, it is because a given student looks more like your reference "A" student than the numerical score suggests.

You can call it that if you want, but I certainly don't think of it that way. Any small adjustments I make to the system, up or down, are based on looking at each individual student and seeing if their grade seems commensurate with their efforts and the skill level attained. And I still think the "imaginary" student referenced in Peter's post was different, a "perfect" student without any social, economic, physical, emotional or familial problems. I don't expect my students to be perfect.

    8. And I think the point of the post is that the algorithms expect an imaginary "perfect" student, which is not realistic.

    9. And, the system in question doesn't allow for any adjustments in case it ends up being unfair to some students.

  4. No, we don't. We don't have imaginary students -- we have real students. We don't need imaginary students, and we don't think about what imaginary students would do.

    1. Julie,

      So when you decide to give a student an A, it is because their work matches a particular student you have had in the past? That is certainly not how I make those decisions.

    2. What? "Decide?" "Assign?" Do you arbitrarily *give* grades to your students? I don't hear teachers using that terminology...

      Expectations are set way before the students are assessed. A realistic (not secret) bar is set, and those who work *earn* those grades. There is no imaginary, perfect student, no stereotypical anything. Instead, we have unique young human individuals trying to learn how to get along in life. The "grade" is not the teacher's personal, subjective opinion of that student, only the measure of the student's mastery of the subject being taught.

      Maybe if these reforms had been put in place by real teachers (and not for the purpose of destroying/privatizing public education) we would be dealing with realistic, research-based standards and student outcomes that actually help students make that connection between hard work and achievement. Instead, we've got teachers (hard work = success) killing themselves to meet this mysterious moving target of "highly proficient," and students who are stuck in this hopeless cycle of nonsense and turmoil as the rules are constantly changing.

      Talk to a teenager and compare your high school experiences. What made school memorable and meaningful for us has been replaced with a turnstile of staff, stress, disconnect, non-stop test prep, the almighty score. For no good reason. Unless those pulling the strings just don't like us and want to give us F's.

    3. la99,

      When you set expectations you are essentially constructing a set of "imaginary" students to compare with your actual student.

I have talked to high school students every day for the last nine years, and their experiences were not very different from mine or my spouse's. They barely notice the standardized tests, as the tests have no impact on them. The almighty score for the student is the teacher-assigned grade. It is the teacher-assigned grade that can keep them from graduating from high school or deny them admission to a state college or university.

  5. Dear Peter,

I don't know if you've had a chance to read it yet, but Jerry Muller's article "The Costs of Accountability" in this month's issue of The American Interest is one of the very few on this topic that I would call "seminal." Diane referred to it earlier this week in a post entitled "metric madness." Would that Judge Treu had at least been familiar with this article's powerful and broad-based arguments before rendering his ill-informed decision in Vergara v. California! Hopefully Judge McDonough will prove somewhat more reasonable. And, dare we say, judicious?

    1. Great article! Thanks for the link, Jonathan.

    2. Judge Treu's mind was made up before he even walked into the courtroom on DAY ONE of the Vergara trial. He didn't just dismiss the other side... nay.... he didn't even listen to one word or argument that contradicted his belief in the Vergara plaintiff's righteousness.

      In this way, the Vergara trial was like the O.J. Simpson criminal trial... except with a judge, instead of 12 jurors.

      It will go the other way on appeal.

3. Jonathan, it's on my In Case You Missed It list for tomorrow. It's a pretty powerful piece of work.