Sally Ho, who works the Nevada/Utah beat for the Associated Press, tried her hand at a Common Core Big Standardized Test Explainer. She needed to do a little more homework, including reading a few more things not written by Common Core Testing flacks.
She sets out with a good question. Last year students were supposed to be taking super-duper adaptive tests that would generate lots of super-duper data. But in many states the computerized on-line testing was a giant cluster farfegnugen, and in many other states there was an "unprecedented spread of refusals."
So what is the impact of the incomplete data?
Common Core History
Ho's research skills fail her here, and she goes with the Core's classic PR line: The Core is standards, not curriculum. Also, it was totally developed by governors and state school superintendents "with the input of teachers, experts and community members." It's pretty easy to locate the actual list of people in the room when the Core was written.
Ho locates the opposition to the Core strictly in the right wing, reacting to Obama's involvement and a perceived federal overreach. Granted, she's a Nevada-Utah reporter, but at this point it's not that hard to note the large number of people all across the political spectrum who have found reasons to dislike the Core.
What Happened Last Year?
Ho's general outline is accurate, though her generous use of passive voice (the Clark County School District "was crippled") lets the test manufacturers and states off the hook for their spectacular bollixing of the on-line testing. She also notes the widespread test refusal (go get 'em, New York).
She also dips into the history of incomplete data, noting Kansas in 2014 and Wyoming in 2010. She might have spared a sentence or two to note that nothing like this has happened before because nobody has tried data generation and collection on this scale before.
How Are Test Scores Usually Used?
States are required to test all students and use their scores to determine how the school systems are doing, which can affect funding. Some states use the data for a "ratings" system. A few are using it as a part of teacher evaluations. In the classroom, schools generally share the data with teachers who use it to guide curriculum decisions and measure individual students.
True-ish, true (particularly with air quotes around "ratings"), true-ish, and false. We can call it extra false because it's not possible to effectively do all of those things with a single test. Tests are designed for a particular purpose. Trying to use them for other purposes just produces junk data.
How Will Incomplete Scores Affect the Classroom?
Ho has a wry and understated answer to this question: "Direct impacts on the classroom are likely to be minimal." I think that's a safe prediction from an instructional standpoint, though she rather blithely slides past "most states aren't using it for teacher evaluations yet," which strikes me as blandly vague, considering we're talking about the use of junk data to decide individual teachers' fates.
Still, it's true that, since the test data never provided anything useful for the classroom, having less of a useless thing doesn't really interfere with anyone's teaching. And if there's a teacher out there saying, "But how shall I design my instruction without a full Big Standardized Test Data profile of my students," that teacher needs to get out of the profession.
Ho might also have addressed the issue that in most states the data, incomplete or otherwise, doesn't arrive before the start of the school year, anyway.
She also claims that everyone says that test scores don't make the final call on grade promotion, which will come as news to all those states that have a Third Grade Reading Test Retention policy.
Oh No She Didn't
Ho answers the question of "Why even bother to test" with the hoariest of chestnuts, the Bathroom Scale Analogy-- "a school district trying to tackle chronic problems without standardize test scores can be like trying to diet without a scale." It is a dumb analogy. I have ranted about this before, so let me just quote me on this:
The bathroom scale image is brave, given the number of times folks in the resistance have pointed out that you do not change the weight of a pig by repeatedly measuring it. But I am wondering now-- why do I have to have scales or a mirror to lose weight? Will the weight loss occur if it is not caught in data? If a tree's weight falls in the forest but nobody measures it, does it shake a pound?
This could be an interesting new application of quantum physics, or it could be another inadvertent revelation about reformster (and economist) biases. Because I do not need a bathroom scale to lose weight. I don't even need a bathroom scale to know I'm losing weight-- I can see the difference in how my clothes fit, I can feel the easier step, the increase in energy. I only need a bathroom scale if I don't trust my own senses, or because I have somehow been required to prove to someone else that I have lost weight. Or if I believe that things are only real when Important People measure them.
Ho tries to hedge her bets by going on to say that of course you need other data, but the basic analogy is still just bad.
What's Next?
Studies looking at the validity of scores that states do have, which is kind of hilarious given that most of the BS Tests have never been proven valid in the first place.* So I guess states will try to find out if their partial unvalidated junk is as valid as a full truckload of unvalidated junk. That is almost as wacky as the next line:
For the next testing cycle, states say they don't expect problems.
Ho might want to check the files and see if the states expected problems this last time. You know, the time with all the unexpected problems. But Nevada has a new test manufacturer, Montana has no Plan B, and New York is leaning on parents. So everything should be awesome soon. And anyway, there's plenty of year left before it's time for the next puff piece on Common Core testing. Can I please request that AP reporters use that time to do some reading?
*I originally wrote that they have never been studied for validity; that's not true. Studies are out there. I and others remain unconvinced by them.