Saturday, April 25, 2020

Defending the Future of the Big Standardized Test

What has happened to our beloved Big Standardized Test? Why do people keep picking on it? And can we lift it back up to its hallowed heights of the past? I have a report sitting in one of my tabs here that wants to answer those questions, yet somehow falls short. It's FutureEd's report The Big Test, and it is yet another attempt to repackage reformster alternate earth history. It's not super long, but I've read it so that you don't have to. Thank goodness I took my blood pressure meds today. Buckle up and let's go.

Who Are These People?

FutureEd is a project of the McCourt School of Public Policy at Georgetown University. It was founded by Thomas Toch, whose previous work included stints at some edu-flavored thinky tanks, as executive director of Independent Education, a private school network in DC, and as an editor at US News. He is one more self-declared education policy expert who has apparently never taught in a K-12 classroom.

FutureEd launched a few years back, with declarations of independence and lack of bias; one more entry in the "new conversation" pageant. But its independence was all that one can expect from a group funded by the City Fund, the Waltons, and Bill and Melinda Gates. Their senior fellows are drawn from 50CAN, Bridge International Academies, Education Trust, the National Association of Charter School Authorizers, Alliance for Excellent Education, and NewSchools Venture Fund. It's a whole blooming field of Reformsters without any traditional public education advocates anywhere in sight.

Their stated mission these days-- "committed to bringing fresh energy to the causes of excellence, equity, and efficiency in K-12 and higher education." This report is part of a series of initiatives on the future of standardized testing being funded by the Bill & Melinda Gates Foundation.

So that's where this report is coming from. Let's dive in.

Setting the Stage

It's a solid dramatic opening, starting with Georgia Governor Brian Kemp announcing plans to cut testing because, he said, "Georgia simply tests too much." How did we get here? This opening section is going to introduce a set of familiar themes that the full report plans to hammer home.

Pressure to reduce testing has come from many, often confounding sources: teachers’ unions and their progressive allies opposed to test-based consequences for schools and teachers; conservatives opposed to what they consider an inappropriate federal role in testing; suburban parents who have rallied against tests they believe overly stress their children and narrow instruction; and educators who support testing but don’t believe current regimes are sufficiently helpful given how much teaching time they consume.

Not sure what's "confounding" here other than some of the familiar inaccuracies in this list. There's the old "teachers and their unions don't want to be held accountable" trope. Conservatives upset by "what they consider" as overreach (but, you know, that's just their opinion). A nod to Arne Duncan's "white suburban moms" who don't want to find out their kids aren't so smart. Note that these parents "believe" that tests cause stress and narrow instruction-- it's just a thing they believe, for some reason.

Every critical opposition to the test is either inaccurately characterized, or carefully marked as what Those People believe. What's missing from the list is what's missing from the entire report-- there's not a shred of blame given to the tests themselves.

Testing, the report assures us, was going to be awesome. It would make sure we were getting a "return on a national investment in public education that reached $680 billion last year." It would "spur school improvement." It would ensure that needs of underserved students were being met. It would highlight achievement gaps and allow for "objective" comparison of "achievement" across all lines. It would identify needed adjustments to instructional programs. And here, just four paragraphs in, is the report's first clue about the BS Test's fundamental problem-- you need different tests for different purposes, not an unfounded belief that a single test can somehow meet a dozen different goals.

What has fueled test resistance? "Union communications and lobbying campaigns, right wing media personalities, and misconceptions about the extent of state testing." Yup, the tests get a bad rap because of PR campaigns based on counterfactual stuff. And this, repeatedly, will be the guiding principle of this report-- when your beloved program is running into trouble, look everywhere for causes except at the program itself. In a classroom, when a teacher says, "My lesson is perfect but those little SOBs messed it all up," this is what we call bad teaching.

And their misconceptions are up to date. They note that Ed Secretary Betsy DeVos has waived the BS Test requirements for this school year. "The move, and the consequent loss of a year’s worth of longitudinal data, could further reduce the standing of state tests." Sigh. If you don't understand that the year's worth of longitudinal data was already gone, washed away in a pandemic tide of school closings and trauma, or if you imagined that somehow we could just pull the students back in to at least take the test so we'd have some data, as if that data wouldn't be junk, then you really don't understand the situation.

They do correctly note that ESSA is a "bulwark" against the rising call to do away with BS Tests entirely. And here's where they're headed with all this:

But a close analysis of the political landscape of standardized testing makes clear that unless a new generation of tests can play a more meaningful role in classroom instruction, and unless testing proponents can reconvince policymakers and the public that state testing is an important ingredient of school improvement and integral to advancing educational equity, annual state tests and the safeguards they provide are clearly at risk.

Yes. Yes, they are. Thank heavens.

History Lesson: The Rise Of Testing

Back in the 1960s, there was "scant information" about how well students were doing. Well, unless you count report cards and stuff. I'm not sure if the implication is that prior to the 1960s things were great, or if we just don't care that far back, or if we're avoiding the consideration of all the great things that were accomplished by people who came through that terrible system (how did we ever get to the moon?). But for bureaucratic purposes, something else was needed, particularly to see if the War on Poverty was working. So we got the National Assessment of Educational Progress (NAEP), about which we've had serious questions ever since.

So then we got accountability movements and standards movements, and local school districts still kept worrying more about their local concerns than about producing satisfying data for federal bureaucrats. So every state had to set standards, and then every state had to test those standards, and those movements got us unironically stated ideas like "The increased requirements reflected a belief that for every child in America to achieve high standards, schools needed to track the learning of every student every year against those standards and be held accountable for the results" because weighing the pig helps it grow. The report does include this accurate sentence:

National leaders simply didn’t trust local educators to do the right thing for low-income students and students of color, so they tried to force them to act, in part, by imposing far more transparency and accountability via testing. 

The sentence is accurate even if you stop twelve words in. And in fairness to the feds, plenty of regions have proven themselves to be extraordinarily untrustworthy when it comes to looking out for non-wealthy non-white students. What has never been clear is how the BS Test would help. We know that test results mostly reflect socio-economic background. Nor has an underperforming school that was otherwise a secret been "uncovered" by BS Test results. One has to wonder why this accountability never extended to obvious things like states and districts that spent far less on non-white non-wealthy students than others, or legislators who refused to address either the symptoms or causes of poverty and systemic racism.

The report suggests that No Child Left Behind was "designed to shed a bright light on education inequities" but then notes that states responded by narrowing instruction to fit test subjects, by piling on "practice" or "progress" tests to check students' BS Test prospects, and by using tests that could be scored quickly and cheaply. Thing is, all of these responses were totally predictable, and were, deliberately or not, exactly what NCLB was designed to do. And all of these lessons were ignored when the feds shifted us to Common Core related PARCC/SBA/Whatever tests.

Building Backlash

So, there we were in the 90s with states holding an "indifferent commitment" to higher standards-- a moment that could have prompted the feds to ask "are these standards really higher" or "does it work to try to impose this stuff top down" or even, "why is our initiative failing"? They didn't ask those questions, but instead doubled down and resolved to fail harder.

The report marks the founding of Achieve, a group of politicians and businessmen (but not educators--never actual educators) that helped lay the groundwork for Common Core and continued to advocate for standards, testing and other reform disruptors. Funny story about Achieve-- they've just decided to close up shop.

There follows here a fun new version of the CCSS origin story-- in this one CCSSO and NGA were just finishing up their draft of the Common Core coincidentally at the same time that the Obama administration was launching Race To The Top, and so governors asked if they could be allowed to use federal funds to help implement Common Core and some aligned tests. This leaves out many parts, some of which are included in depth in this great piece by Lyndsey Layton (if you've never read it, do so now-- seriously, I'll wait) and also seems to miss the part where the RTTT and the waivers that followed sort-of kinda required states to adopt CCSS. The report says that states were "spurred in part by the prospect of federal largesse," which skips the part where states were facing the 2014 NCLB deadline requiring them to have 100% above-average standards testing results for their students or be penalized UNLESS they agreed to the Obama administration plans. So, spurred by largesse and also kind of extortiony stuff, too.

Enter the Tea Party, which didn't like the federal overreach (and which fed a crazy huge number of bizarre claims about the Core). Also, it's worth noting that not every conservative who objected to the policies was a Tea Party radical. And enter also the teachers union; the report again levels the critique that the teachers didn't like the tests or "the new accountability they represented" and a rank and file who didn't like their livelihood dependent on test results. The report seems to want to blame the opt-out movement on teachers, which is a real slap in the face to the many parent activists who actually made the movement happen. The report focuses on the lobbying and PR resistance launched by these groups; it does not consider the possibility that teachers didn't like having their evaluation linked to test scores soaked in VAM sauce because of the giant pile of evidence that the evaluation system was invalid and unreliable. The report, bizarrely, cites three books about the issue-- The Test, by Anya Kamenetz; The Testing Charade, by Daniel Koretz; and Beyond Test Scores, by Jack Schneider. It mentions these books as adding fuel to the fire, but the report does not crack those books open to consider any of their criticism of the test. That would have been a wise move; Koretz does a particularly good job of laying out why the BS Tests have failed, and failed to gain fans.

Obama Retreats

Remember when Arne Duncan and the Obama administration retreated on the whole testing thing? Yeah, me neither. Duncan made some noises about how maybe testing was going too far, and how it was a Bad Thing that schools were narrowing curriculum to boost test scores, without ever considering his own policy role in those occurrences. The report mentions the cap idea-- that states make sure only 2% of class time be spent on testing-- which simply missed the point. It's the test prep that sucks up a ton of time, the narrowing of curriculum that damaged education for many students. From out here in the cheap seats, all I ever saw was Duncan/Obama trying to have it both ways, to look and sound sympathetic without ever providing any useful relief to the problem and especially to never, ever take any ownership of it.

Eventually ESSA happened, which sort of provided some relief, but still worshipped at the altar of the testing cult and added some crazy-pants ideas, like using the SAT and the ACT as the official Big Standardized Test for the school, a purpose for which they were neither designed nor suited.

What Legislators Did

One piece of actual research in this report is a look at what states introduced and enacted in the way of test-relief bills. Lots introduced, sixty-some enacted. Reducing the number of tests was most popular, with shortening the test and capping testing time right behind.

This pushback, the report confirms, has been mostly bipartisan-- sort of. While occasionally anti-testing bills have bipartisan origins, it's also true that in some areas, one party or the other is leading the charge. And they cite a bunch of lobbying, mostly by teachers unions.

What About Teachers And Their Unions

"Teachers' take on testing is complex," says the report, saying that teachers favor testing when it's useful, but not when it isn't, which doesn't really seem that complex unless you are actively resisting the insight that the BS Tests have not been useful. The report's take on the unions is complex in the sense that it's, well--

Teacher union leaders are forthright about not wanting their teachers held accountable for their students’ achievement on standardized tests, and about their opposition to high-stakes school accountability more generally.

There's no footnote for this assertion, but I have yet to ever hear or read a union leader saying any version of "we don't want teachers to be held accountable..." I've heard lots of people say that the BS Tests and VAM goop (the report never gets into VAM) are a lousy way to measure teacher effectiveness, but the repeated implication that teachers are anti-accountability, which in turn implies that they are lousy slackers trying to hide their slackness-- I've heard that plenty, and I'm hearing its echo here. They do offer some Randi Weingarten quotes including the correct observation that "there was a fixation on the teachers and the consequences for the teachers rather than a fixation on what children needed."

They cite a survey from the Center on Education Research indicating that teachers like standardized tests, plus a survey from Educators for Excellence, a teacher union-alternative reform group, that says so, too. Education Next, another pro-reform publication, found that the public likes the tests, and NWEA-- a test manufacturing company-- said their research also shows all the standardized test love. No, says the report, it's just those damn unions throwing their weight around.

Time On Testing  

This point was popular with Duncan. People object to the BS Tests because they are confusing them with all those other tests.

For instance, schools use a lot of interim tests and practice tests and let's-find-out-which-kids-need-extra-prep tests, none of which are actually mandated by the feds. Which is true, but fails to understand how high stakes testing works. Imagine you lead a school band, and you know you have a major performance coming up, a performance that has high stakes for you and your musicians. If your boss says, "No big deal-- don't rehearse or prepare or anything, just hand out the music and sight read it the day of the concert," would you listen to that advice? Of course not.

A central issue of BS Testing is that proponents imagine that the test is frictionless and simple, that if a student has the skills, they can be quickly and seamlessly used anywhere else. But authentic assessment means that the assessment task closely resembles the practiced skill, and nothing resembles taking a poorly designed multiple choice test on a computer more than, well, doing that same thing. The BS Tests, just as in the days of NCLB, have been designed to be cheaply and quickly administered-- NOT to measure the things that we say we want to measure. So schools spend plenty of time practicing doing the exact things that the BS Test wants to measure.

Solutions For The Future?

Well, the report hints at liking the competency-based all-testing, all-the-time model, where we just keep hitting students with little standardized tests all through the year. Also, the ESSA allows states to putz around, so there's that. But the report is showing the same old problems. At one point they quote an expert who says

States, with the cooperation and collaboration of local districts, need to develop systems of assessment that balance the state [accountability] program with assessments that actually help kids learn.

And then later quotes another expert who says they need testing "that requires actually getting the school-improvement side of state accountability systems right." Plus testing has to address parents, they say, who are more interested in their child's performance than school system performance. Those are not the same thing, and they would require different tests used in different ways. As long as the dream is a single test that can serve a dozen different purposes, the BS Test will be a waste of time and money.

Hurry! Hurry!!

The report imagines a race against time, citing the "pincer movement" between the union and the Tea Party that almost hurt testing in 2015 while ESSA was coming into being. But, they say, "the support of education organizations like the Education Trust and the Council of Great City Schools won the day on Capitol Hill." Well, maybe. But it's not really a great thing when those kinds of corporate reform edu-amateurs carry the day. Dismissing Diane Ravitch as a polemicist instead of listening to what she has to say is also not useful.

But the report is concerned that with reauthorization of ESSA looming in the--well, it will probably happen sooner or later-- and the tide running against testing, maybe the next version of an education bill won't have the BS Test's back. And the last paragraph of the whole thing shows a breath of honesty and then, well--

That leaves school reformers in a race against the clock to create testing systems that are more valuable to educators and parents and that offer meaningful windows into school and student performance without overwhelming teachers and principals. That is, they’re in a race to change the national narrative on standardized testing.

No! Not the same thing. "Come up with tests that are useful and not-sucky" is not the same as "craft some better PR to control the narrative."

So What's Missing Here? 

When you build a hammer out of jello and builders reject it as a useless tool, you do not have a PR problem or a narrative problem. You aren't the victim of lobbying by the carpenters' union or some radical Wood House Society. Your problem is that you have created a hammer that doesn't do the job it was intended to do.

The BS Test has always had a jello hammer problem, on top of claims that not only could it be used to hammer nails, but it could also drive screws and strip paint and smooth concrete and patch drywall.

The BS Test was created with a promise that it would be usable for multiple purposes, and yet it was actually created for none of them, but to be easy to administer and score. It measures what test makers think is easy to measure, not what anybody actually wants it to measure. The report is worried that losing the tests will also lose "the safeguards they provide," but all these years in, and nobody has really made a case for the safeguards. Where are the compelling stories of schools that were struggling, but then BS Test results turned them around? Because we have far more stories of how some states (looking at you, Florida) have used this accountability system to target schools for privatization, or to signal vultures that this neighborhood would be a good place to move in an edu-business (still looking at you, Florida).

If the BS Tests had generated usable, accurate data-- if they had actually been useful-- then they might have been widely embraced, despite their top-down imposition. But they were wielded as a threat ("This is how we'll catch all those terrible teachers who are ruining schools") and their data was tied to punishments, not improvements. And their data has been shown, again and again and again, to be flawed and unreliable. They have taken the broad, expansive vision of US education, to provide a strong foundation for young people to nourish their interests and abilities and build the future they dream of, and reduced it to a meagre, cramped goal-- get a good score on a math and reading standardized test. High stakes testing has forced a small, uninspiring, dim view of what schools should be.

If the folks at FutureEd are really concerned about the future of the Big Standardized Test, I suggest they stop looking everywhere but at the test itself. I suggest they listen to the critics and consider what truth those critics have to offer. If they don't like listening to teachers and parents, I can recommend fellow thinky tankers like Jay Greene, from the very reform Department of Education Reform at the University of Arkansas, who has pointed out at great length that raising test scores shows zero connection to improving student life outcomes. I suggest they consider, really honestly, what might be wrong with high stakes testing policy itself; stop treating it as a PR problem and look at it as a product problem.

The authors are sad that this year's test has been scrapped. I have more bad news-- next year's test may very well also be a waste of everyone's time, to be either canceled or to generate data from a situation so unique as to bear no comparison to any other data. At this rate, we might get used to living without the BS Test, and I haven't seen anything to make me think that would be a bad thing.

1 comment:

  1. A student can answer incorrectly on any individual test item for any one, or even a combination of, at least a dozen different *reasons.
    Simply knowing they scored wrong tells us nothing useful if the intention is to fix/improve student test taking achievement. This is why the BS test will forever be a waste of time.

    *Reasons for Wrong Answers:
    Knowledge deficit; skill deficit; vocabulary deficit; physical exhaustion; physical illness; personal stress/trauma; test taking anxiety; test taking fatigue; apathy; poorly crafted/confusing test item; bad standard; chronic absenteeism; learning disability; dyslexia; novice English language learner; poor teaching; inadequate test prep; and more . . . !