Sunday, May 4, 2014

NY Explains Worst VAM Ever

A reader pointed me at an extraordinary piece of educational in-house PR from our good friends at engageNY, the uber-reformy wing of the NYSED. It's extraordinary because, if accurate, it explains clearly and simply exactly how VAM-style evaluation can be made even worse.

Carol Newman-Sharkey linked me to this short informational video. You know it's going to be fun, because it's stylish cel animation (I'm a sucker for traditional art forms), and it actually turns out to be quite easy to understand, which makes it that much more terrible.

The video wants to explain New York's student growth measurement to teachers. It starts by reminding us of a True Thing-- that the system where we were judged on a student's single context-free score absolutely sucked. You remember those days under NCLB, where we all worried about receiving students whose limitations guaranteed that they would never have a sufficiently high score.

I mean, let's be honest-- when we started to hear about the idea of a growth model, a model that gave us credit for "growing" a student's ability instead of simply marking his level, most of us were pretty okay. Oh, but the devil in those damned details. We wanted more sensible measures of our work. Instead, we got VAM.

Here's how the video explains NY's system.

We look at Pat's score this year, and we compare it to Pat's score last year. Then we look at that pair of scores, and we compare the improvement only to other apples-- to other students who are just like Pat. That means students who got the same score last year, and who have the same characteristics on NYSED's list of characteristics.

Once we've set up the group of similar apples, it's just straight percentiles. If Pat scored better than 90 students in Pat's group, Pat's SGI is 90. And then we average all the SGIs in Pat's class to get a number for Pat's teacher. Actually, engageNY seriously muddies the water here by saying that Pat's score is 90% even though it's not actually a percent of anything. Of course, if I had created a system this dumb, I'd want to keep it hidden behind a big muddy cloud, too.
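For the concrete-minded, here's roughly what that arithmetic looks like. This is just a sketch of the video's description in Python-- the field names, the `traits` tuple, and the literal bucketing are my inventions, and the state's actual model presumably involves more statistical machinery than this.

```python
from collections import defaultdict
from statistics import mean

def sgi_scores(students):
    """Compute growth percentiles the way the video describes them:
    group students by prior-year score plus the state's list of
    "similarity" characteristics, then rank within each group.
    `students` is a list of dicts with hypothetical keys:
    'id', 'prior', 'current', and a hashable 'traits' tuple."""
    groups = defaultdict(list)
    for s in students:
        groups[(s["prior"], s["traits"])].append(s)

    sgi = {}
    for group in groups.values():
        for s in group:
            # "If Pat scored better than 90 students in Pat's group,
            # Pat's SGI is 90" -- a percentile rank, not a percent.
            beaten = sum(1 for other in group
                         if other["current"] < s["current"])
            sgi[s["id"]] = round(100 * beaten / len(group))
    return sgi

def teacher_score(class_ids, sgi):
    """Average the SGIs of everyone in the class; that average
    becomes the teacher's number."""
    return mean(sgi[i] for i in class_ids)
```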

There are two stupid things happening here.

Stupid Thing #1

Somewhere in some office in NY is an official whose job it is to determine what makes students "similar." The video references learning disability, English language learner status, and socio-economic background.

If we knew exactly which characteristics influenced student learning in exactly what way across all learners, would we not be using that information to create a perfect education system? If we could say, "Yes, this much poverty causes this much difficulty with learning exactly these skills," would we not be able to correct for this in regular teaching?

This phenomenon deserves its own post, but the short version is this: We keep building dumb systems on an assumption of a particular piece of knowledge where, if we actually HAD that knowledge, we would be using it for something other than the dumb system. If we really knew exactly how certain factors affect all student learning in pretty much the same way, the last thing we'd use that knowledge for is this dumb evaluation system.

Furthermore (what a great word-- you can just hear the high dudgeon in my voice), such a system of mapping student similarities is based only on static, steady-state characteristics. It factors in "Chris has poor parents" but not "Chris didn't get to eat for twenty-four hours before The Test" and certainly not "Something made Chris really upset on Test Day."

The assumption that we can map similar students by mapping all the pertinent factors that affect their education is a dumb assumption, but it is the same dumb assumption that lies at the core of ordinary VAM foolishness. To make SGI stand out, we need another brain-impaired cherry to put on top of the nincomboobulous sundae.

Stupid Thing #2

Do you know what percentiles are when you use them like this? Stack ranking.

Stack ranking's most notable quality is that it requires winners and losers. You might think that teaching a classroom so effectively that every single student grew and learned and excelled would be a Good Thing, but in New York, you would be wrong. In New York State, if 100 "similar" students find a cure for cancer, the student whose cure works most slowly has a Student Growth ranking of zero. If you teach 100 "similar" six-year-olds to read and write best-selling novels, the six-year-old whose novel comes in lowest on the NYT best-sellers list earns a student growth score of zero. Stack ranking by percentile creates undeserving losers.

To be fair, it also creates undeserving winners. If 100 "similar" students all fail to learn anything from, say, the sort of canned and scripted curriculum favored by engageNY, the student who displays the greatest mastery of that least amount of learning will still be ranked at 99.
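If you want to see just how mechanical that is, here's a toy demonstration-- not NY's actual computation, just percentile stack ranking applied to two hypothetical groups of 100 "similar" students. Whether everybody grows enormously or nobody grows at all, somebody gets a zero and somebody gets a 99.

```python
def percentile_ranks(gains):
    """Stack-rank a list of score gains: each student's rank is the
    count of group members they beat, scaled by the group size."""
    return [round(100 * sum(1 for g in gains if g < mine) / len(gains))
            for mine in gains]

# 100 "similar" students who all grew enormously (gains of 50 to 149 points).
huge_growth = list(range(50, 150))
# 100 "similar" students who barely grew at all (gains under one point).
tiny_growth = [g / 100 for g in range(100)]

print(percentile_ranks(huge_growth)[0])   # 0  -- the slowest cancer-curer
print(percentile_ranks(tiny_growth)[-1])  # 99 -- the best of the non-learners
```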

I suspect that engageNY will claim that these issues will be evened out by using student data from all across the state. I suspect my response would be, "What?! Where the students live is not one of the factors that makes them similar?"

The video is not new, so perhaps my work here is moot-- perhaps cooler NY heads have already said, "Yeah, that's messed up. Let's rewrite this and do better." Course, the video is still live on their website, so maybe not. I kind of wonder what John Q. Parent would think about this cartoon if he saw it.

(Incidentally, Wikipedia doesn't have an entry for engageNY. Perhaps some enterprising NY teacher could lend a hand.)
