PARCC is promoting two new radio spots that feature a couple of Teacher of the Year winners touting the wonderfulness of the PARCC.
The National Network of Teachers of the Year produced a "research report" last year that determined that the Big Standardized Tests are super-duper and much more better than the old state tests. Was the report legit? Weelll.....
The report was reviewed by three-- well, "experts" seems like the wrong word. Three guys. Joshua Starr was a noted superintendent in Maryland, where he developed a reputation as a high stakes testing opponent. He lost that job, and moved on to become the CEO of Phi Delta Kappa. Next, Joshua Parker was a compliance specialist with Baltimore Schools, a teacher of the year, and a current member of the reform-pushing PR operation, Education Post. And the third reviewer was Mike Petrilli, head of the Fordham Institute, a group dedicated to promoting testing, charters, etc.
The study was funded by Rockefeller Philanthropy Advisors, while the NNTOY sponsors list includes the Gates Foundation, Pearson, AIR, ETS and the College Board-- in other words, every major test manufacturer in the country that makes a hefty living on high stakes testing.
So the study's conclusion that tests like the PARCC and the SBAC are super-excellent is not exactly a shock or surprise, and neither can it be a surprise that one follow-up to the study is these two radio spots.
The teachers in the spots are Steve Elza, 2015 Illinois TOY and applied tech (automotive trades) teacher, and Josh Parker, a-- hey! Wait a minute!! Is that? Why, yes-- it appears to be one of the reviewers of the original study. Some days I start to think that some folks don't really understand what "peer review" means when it comes to research.
Anyway, the spots. What do they say? Let's listen to Elza's spot first--
A narrator (with a fairly distinct speech impediment which-- okay, fine, but it's a little distracting at first) says that Illinois students took a new PARCC test. It was the first time tests were ever aligned with what teachers taught in the classroom! Really!! The first time ever, ever! Can you believe that? No, I can't, either. And some of the best teachers in the country did a study last year to compare PARCC to state tests. And now, 2015 Teacher of the Year, Steve Elza:
Every teacher who took part in the research came to the same conclusion-- PARCC is a test worth taking. The results more accurately measure students' learning progress and tells us if kids are truly learning or if they're just repeating memorized facts. Because PARCC is aligned to our academic standards, the best preparation for it is good classroom instruction. As a teacher, I no longer have to give my students test-taking strategies-- instead I can focus on making sure students develop strong, critical, and analytical thinking skills. Our students were not as prepared for the more rigorous coursework in college or even to start working right after high school.
Sigh. First, "truly learning" and "repeating memorized facts" are not the only two things a test can measure, and any teacher who is not teaching test-taking strategies is not preparing her students for the test. I'm glad Elza is no longer working on test-taking strategies in auto shop, and I'm sure he's comfortable having his skills as a teacher of automotive tradecraft judged in part on student math and English standardized test scores. The claim that PARCC measures readiness for the working world is just bizarre. I look forward to PARCC claims that the test measures readiness for marriage, parenthood, and running for elected office.
The narrator returns to exclaim how helpful PARCC is, loaded with "valuable feedback" that will make sure everybody is ready for "success in school and life." Yes, PARCC remains the most magical test product ever manufactured.
So how about the other spot? Let's give a listen.
Okay, same narrator, same copy with Illinois switched out for Maryland. That makes sense. And now, teacher Josh Parker:
Every teacher who took part in the research came to the same-- hey, wait a minute!! They just had these two different teachers read from the same script! Someone (could it be the PARCC marketing department?) just put words in their mouths. Parker goes one extra mile-- right after "analytical thinking skills" he throws in "PARCC also pulled back the curtain on a long-unspoken truth" before the baloney about how students were unprepared for life. Also, Parker didn't think there was a comma after "strong."
One more sad piece of marketing for the PARCC as it slowly loses piece after piece of its market. It's unfortunate that the title Teacher of the Year has been dragged into this. The award should speak more to admirable classroom qualities than simply be a way to set up teachers to be celebrity spokespersons for the very corporations that have undercut the teaching profession.
Friday, October 16, 2015
PARCC Expectations
As states continue to brace themselves for the release of crappy PARCC scores, now is a good time to look, again, at the PARCC Levels of Student Awesomeness:
Level 1: Student did not meet expectations.
Level 2: Student partially met expectations.
Level 3: Student approached expectations.
Level 4: Student met expectations.
Level 5: Student exceeded expectations.
All levels share a critical term. Expectations.
It's a well-chosen word from a PR perspective. Well-chosen, but not correct. Even, kind of, a lie.
After all-- what are expectations? They are an idea you form before the fact. I have expectations about how my food will taste, and then I taste it. I have expectations about how good a movie will be, and then I watch it.
I don't listen to a new music release and then, after I've heard it, develop some expectations about whether it will be any good.
And in Teacher 101, we all learn that our expectations of our students will shape their performance-- what we expect them to accomplish will affect what they actually accomplish. Expectations are the horse, and performance is the cart.
So if we talk about expectations on a test, that means that before students take the test, we say, "I expect that students who really know this stuff will get at least nine out of ten items correct." In fact, if we're good teachers, we share the expectations with the students so that they know where the bar is set. That way they can also set some expectations.
By talking about "expectations," test manufacturers give the impression that their tests follow a similar chronological progression. They design the test. They set expectations of the "top students will get nine out of ten correct" sort. The students take the test. We score them and see how well they met the expectations.
That, of course, is not how it works at all. The test is designed. Students take the test. We score the test. And then, we set "expectations."
And the sequence the word implies can only possibly be true if PARCC headquarters houses a time machine.
You cannot set expectations after an event has already occurred.
We need a new word, a different word, for what test manufacturers and bureaucrats do when they set cut scores and decide who does well and who does not, because they are not setting expectations. Words have meaning, and that is not what "expectations" means. They might just as easily say "Student exceeds badgers" or "Student is taller than blue." But to say "student exceeded expectations" when you had no idea what the expectations were before you handed out the test-- that's simply a lie. The use of "expectations" is a way to hide the truth of the process from parents, teachers, students and politicians.
Thursday, September 10, 2015
PARCC's Cut Scores and Need To Know
The folks at PARCC have set cut scores. You just don't need to know what they are.
The one published cut score is the one that draws the line between levels 3 and 4 ("not quite good enough" and "okee dokee"). That's set at 750 on a scale of 650 to 850. The cut scores for the other levels, the projected percentages of students falling into the various troughs-- that's all secret for the time being.
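For the curious, here is a minimal sketch (in Python, purely for illustration) of how a cut-score table turns a scaled score into one of those performance levels. Only the 750 line between levels 3 and 4 has actually been published; every other cut in the sketch is a made-up placeholder, not a PARCC figure.

```python
# Illustrative only: maps a PARCC scaled score (650-850) to a performance level.
# The 750 cut (level 3 / level 4 boundary) is the one published number; the
# other cut scores below are invented placeholders, NOT PARCC's actual values.

LEVELS = [
    (650, "Level 1: Student did not meet expectations"),
    (700, "Level 2: Student partially met expectations"),   # placeholder cut
    (725, "Level 3: Student approached expectations"),      # placeholder cut
    (750, "Level 4: Student met expectations"),             # published cut
    (790, "Level 5: Student exceeded expectations"),        # placeholder cut
]

def performance_level(scaled_score: int) -> str:
    """Return the highest level whose cut score the scaled score reaches."""
    if not 650 <= scaled_score <= 850:
        raise ValueError("PARCC scaled scores run from 650 to 850")
    label = LEVELS[0][1]
    for cut, name in LEVELS:
        if scaled_score >= cut:
            label = name
    return label

print(performance_level(752))  # -> "Level 4: Student met expectations"
```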
There are three takeaways here for the general public.
There are no standards here
When you set an actual standard, an actual line that marks the difference between, say, ready for college and not ready for college, you set it before you do the measuring.
In my classroom, the grading scale is set before the students even take the test. In fact, before I even design the test. 70% is our lowest passing grade, and so I design a test on which someone would have to display the bare minimum of skill and comprehension to get a 70%.
The PARCC folk are saying that they will draw a line between college ready and not college ready-- but not before the test has been taken. How does that even make sense? How do you give a test while saying, "This will show whether you're ready for college or not, but at this moment, we don't really know how much skill and knowledge you have to have to be ready for college."
This is the opposite of having standards. Standards mean setting the bar at six feet and saying, "You have to clear this bar to be considered a good jumper." This is saying, "We don't know what a good jump height would be, but we are going to judge you on whether you're a good jumper or not, but we're not going to put the bar up until after you jump."
Why are we setting cut scores now? Do we know the difference between a student who is college ready and one who is not? Is there some reason to believe that changes from year to year?
Transparency
We have just about reached the point where the only way PARCC could be less transparent would be for them to require students to take the test blindfolded in a dark room on computers with the monitors turned off. This has to be the worst service ever provided by a government contractor.
Useful feedback
This is why I bust a small gasket every time somebody tries to justify these tests because they provide such useful feedback to districts and classroom teachers. PARCC is providing the most useless, data-free feedback imaginable-- and the school year has already started.
Says PARCC, "Some of your students have scored at varying levels on a test that may or may not have put them on a certain level. You can't know about the questions they answered, which ones they got wrong, or what specific deficiencies they have. And we won't even tell you the simple rating (grade) we're giving them for a while yet. But go ahead and take this gaping hole where data is supposed to be, and use it to inform your instruction."
Meanwhile, PARCC is parcelling out information on a need-to-know basis, and nobody needs to know.
UPDATE
PARCC yielded to pressure and coughed up a bit more information, including the rest of the cut scores. Mercedes Schneider has the full story over at her blog.
Monday, June 22, 2015
Fox Runs PARCC PR
Fox News Sunday took a little under four minutes to provide some uncritical promotional time for PARCC, using their "Power Player of the Week" spot to let Laura Slover, PARCC CEO, push the usual PARCC baloney. It's short-- but I've watched it so that you don't have to.
Chris Wallace kicks things off by saying that Common Core was "started by governors and state education officials as a way to set standards," so we know we're entering the Feel Free To Spin Zone right off the bat, though the second half of that sentence notes that it has become controversial because of concerns over federal interference (it is) and whether or not it's the best way to teach kids (it isn't). So I guess he's acknowledging the controversy, if not teaching it. But let's go visit a group that's testing how well Common Core works.
Roll title card for PPOTW.
Cut to Slover's talking head saying that high standards are vital because high expectations will make students do better.
Explanation that PARCC is one of two state consortia for testing. This is the first of many opportunities Wallace will have to note that PARCC started out with twenty-three members and is now down to twelve, but that little market-based measure of PARCC's failure will not make it into the profile. He'll just mention the twelve state figure in passing and let it go at that.
Slover will now run the talking point about how PARCC is a new kind of test where you don't (always) bubble in the right answer, but now drag and drop the right answer, which is, you know, totally different. She also claims that the tests measure critical thinking, problem solving, and writing, and as we have seen repeatedly, that's mostly a lie. Problem solving, maybe. Writing, not in any meaningful way. Critical thinking, never.
Wallace takes a third grade test and mentions that it was "a little challenging." We see a shot of him being amazed? incredulous? that an answer is dragged and dropped instead of being clicked on (because this is how we test eight-year-olds' advanced mouse operating skills-- that's in the Core, right?) but no real discussion of what the questions entailed. Nor do we ever address where the questions come from or why anyone should believe they are a good measure of anything in particular.
Next, several GOP Presidential hopefuls say mean things about Common Core, including Bobby Jindal and Ted Cruz, both of whom get sound bites about how the feds are intruding. Wallace tells Slover, "The main complaint is that this is all part of a federal takeover of local schools." And I suppose that might be the main complaint among Fox News viewers, but c'mon-- even over there word has to have come by now that a whole host of working teachers and education experts have a list of concerns about the actual quality of the standards.
Slover counters that this is a state-driven program and states make all the decisions. But Wallace says it's more complicated than that (though not to Slover, who clearly did not need to wear her big girl pants to this interview). Wallace notes that Race to the Top effectively pushed the Core on states, but he skips over the whole business of waivers; that omission seems odd, given that the Obama administration's end run around the law would be just the sort of shenanigans that Fox viewers would love to get outraged about.
We'll now give a few seconds to the opt out movement. Actually, we don't acknowledge there's a movement--we just indicate that some parents choose to pull their children from the test. But not Slover-- she wants to have her young daughter take the test because "I want to be sure she's learning." Because this highly educated CEO of a testing corporation won't know whether or not her child can read or do math unless she has test results to look at. There are so many things Wallace could have done at this juncture, but even in non-confrontation mode, he could have shown us the report that PARCC provides, which basically gives a simple verbal version of a letter grade.
But we're sticking to the usual narrative, which means that besides the usual anti-fed opposition to the Core, the other group we'll mention is-- you guessed it-- the teachers unions. As we watch picketing clips, we're reminded that the union doesn't like testing because they "worry" that their members will be judged on test results. Wallace has nothing to say about that concern (not even a simple observation that the Value-Added method for doing test-based judgment has been rejected by every authority on the subject).
Instead we go back to Slover to ask her how she feels about being slammed by both the right and the left. She takes the softball and says, "We must be doing something right," with a hearty smile.
Wallace begins the wrapup by observing that PARCC is fine-tuning by doing things like making the test 90 minutes shorter next year. But, he says, Slover says the basic principle is sound. Was there a basic principle we talked about anywhere in this piece? No matter-- Slover is now going to opine on the one testing talking point that we haven't yet squeezed into this piece of PR fluffery:
For too long in this country, success has been really a function of what income level parents have and where kids grow up. We think it's critical that kids all have opportunities, whether they live in Mississippi or Massachusetts or Colorado or Ohio, they should all have access to an excellent education, and this is a step in the right direction.
I'll note one more instance of the "access" construction favored by reformsters (would you rather have access to food, or food?). But mostly I'm impressed that Slover is able to deliver all of that speech with a straight face, given that we know that the PARCC and tests like it correlate most directly to socio-economic class. It would have been nice if Wallace had asked something like, "So how, exactly, does taking a standardized test give kids access to an excellent education?" But he just pops up to note that whether or not this is a step in the right direction is debatable, which, yes, yes, it is, and as a debatable issue, it deserves some actual fact-based reporting about the sides of that debate, but Wallace just finishes the sentence by promising us that it will be a big issue for GOP candidates.
I know that the "Power Player of the Week" segments are not meant to be hard news, but this is just a three-minute advertisement for PARCC masquerading as news. A long time ago, television personalities used to pitch products in advertisements during their own programs, but they stopped doing it because it was undignified and hurt credibility. Would that modern news channels (not just Fox) had another such epiphany.
Wednesday, June 10, 2015
Pearson Wants To Check Your Glasses
You just can't make this stuff up.
Pearson VUE is the division of the massive corporation that actually delivers tests to a computer screen near you. They are, for instance, the folks who handle the actual administration of the GED, but they also handle nursing exams and many financial industry clients.
But you don't stay on top of that industry without being on top of things. So here's a new policy that came out in a February circular from Pearson VUE:
Pearson VUE upholds a high level of security for safeguarding the testing programs offered by our exam sponsors. To maintain this high level, we are continually evaluating our technology and processes to ensure that we are adequately addressing existing and emerging security threats. New technology advancements in eyewear, such as Google Glass, camera glasses and spy glasses, and the availability of this technology have been identified as security risks.
As a result, we conducted a pilot to improve our processes to visually inspect candidate eyeglasses during the admissions process and created specific training on how to identify eyeglasses with built-in technology. The purpose of the pilot was to field test the change in process for visually inspecting all candidate glasses for built-in technology.
Yes, the next time you go to take the GED, you'll have to present your eyeglasses for inspection (though the test administrator is not to actually touch them) to determine that you are not using any spywear.
No sign yet that we'll be imposing similar security measures on students taking the PARCC, but I am now officially not going to be shocked when it happens. Because when you're protecting something as precious as proprietary test information, you just can't be too careful.
Saturday, May 16, 2015
Dan Masi: PARCC PR Explained
Well, this is pretty awesome. PARCC has a video explainer (which I guess is more high tech than a mansplainer) intended to help us all understand how the PARCC is super-awesome. It's only a few minutes long and will make what else I have to show you funnier.
Ordinarily I would watch this and deconstruct it for you shot by shot, pointing out the various moments of transparent stupid. But Dan Masi is already on it. He has fixed the audio quality so that we can hear the real message (note-- the scrolling website recommendations at the end are his, not PARCC's). As I type this, Masi's work has only been viewed six times, which is just a waste. Watch and enjoy:
How Big Is The Honesty Gap
Sooo many folks are Deeply Concerned about the Honesty Gap. Just check out Twitter:
Parents and educators deserve accurate data about how their students are performing in the classroom: http://t.co/FeIjLPlwZ4 #HonestyGap
— StudentsFirst (@StudentsFirst) May 14, 2015
.@EvanE4E: Gap between state expectations & NAEP confirms need for rigorous, consistent, clear standards http://t.co/P3WiuEt9a6 #HonestyGap
— Educators4Excellence (@Ed4Excellence) May 14, 2015
States are saying students are “proficient” when they're not actually well prepared. We need to fix the #HonestyGap: http://t.co/JbinzeO3aF
— CAP Education (@EdProgress) May 14, 2015
Awesome Products + Dubious Rewards = Bad Experience http://t.co/MdnibIcIZ7 #SurveySweepstakes #HonestyGap
— Customerville (@customerville) June 12, 2014
Oops! That last tweet was apparently about some other Honesty Gap.
The Gappers are repeatedly expressing concern that parents need to know the truth about how their children are doing, specifically whether or not students are ready for college. Apparently everyone in the world is lying to them. Schools and teachers are lying when they assign grades. Even college letters of acceptance are Big Fat Lies. Everyone is lying-- the only possible way to know how your child is doing is to have that child take a Big Standardized Test, and not just any BS Test, but one from our friends at PARCC or SBA. Only those profoundly honest tests will do.
I got into a twitter discussion about this because I asked: if NAEP is the gold standard by which state tests can be measured, why do we need the state test? Because the NAEP only samples, and we need to test every single child so that parents can get feedback. Okay, I asked-- doesn't that mean that the tests are for two different purposes and therefore can't really be compared? No, they can be compared if NAEP disaggregates well. So then why can't we-- well, I don't blame the person on the other end. Trying to have a serious conversation via twitter is like having sex by semaphore.
I gather that proof of state honesty would be more students failing, because once again we have an argument that starts with, "We know states suck at education and that students are doing terribly, so we just need to design an instrument that will catch them sucking." It's so much easier to design the right scientific measure if you already know what the answer is supposed to be.
So where is the actual honesty gap?
Is it where Common Core promoters claim that the standards are internationally benchmarked? Is it when CCSS fans suggest that having educational standards leads to national success? Is it when they decry low US test scores without noting that the US has been at the bottom of international test results as long as such things have existed?
Is the honesty gap in view when these folks say that parents need transparent and clear assessments of their children's standing, but what they mean is the kind of vague, opaque reports that have been proposed? You know-- the report that basically gives the child a grade of A, B, C or D on a test whose questions nobody is allowed to see or discuss? Is the honesty gap cracking open even wider every time somebody suggests that a single math-and-reading test can tell us everything we need to know about a child's readiness for college and career?
Are we looking into the abyss of the gap when advocacy groups fail to mention that they are paid to support testing and the Core, or that they stand to make a ton of money from both? Does the honesty gap yawn widely when these folks fail to state plainly, "We think the world would be a better place if we just did away with public education, and we work hard to help make that happen"? Is Arne Duncan's voice echoing hollowly from the depths of Honesty Gap Gulch when he suggests that telling an eight-year-old that she's on the college track either can or should be a thing?
It is ballsy as hell for the reformsters, who have been telling lie after lie to sell the CCSS-testing combo for years (oh, remember those golden days of "teachers totally wrote the Common Core"?), to bring up concerns about honesty. I admire their guts; just not their honesty.
They have a hashtag (because, you know, that's how all the kids get their marketing done these days) and I encourage you to use it to add your own observations about where the #HonestyGap actually lies.
Saturday, March 14, 2015
Pearson Proves PARCC Stinks
When I was in tenth grade, I took a course called Biological Sciences Curriculum Studies (BSCS). It was a course known for its rigor and for its exceedingly tough tests.
The security on these tests? Absolutely zero. We took them as take-home tests. We had test-taking parties. We called up older siblings who were biology majors. The teacher knew we did these things. The teacher did not care, and it did not matter, because the tests required reasoning and application of the basic understanding of the scientific concepts. It wasn't enough, for instance, to know the parts of a single-celled organism-- you had to work out how those parts were analogous to the various parts of a city where the residents made pottery. You had to break down the implications of experimental design. And as an extra touch, after taking the test for a week outside of class, you had to take a different version of the same test (basically the same questions in a different order) in class.
Did people fail these zero-security take home tests? Oh, yes. They did.
I often think of those tests these days, because they were everything that modern standardized test manufacturers claim their tests are.
Test manufacturers and their proxies tell us repeatedly that their tests require critical thinking, rigorous mental application, answering questions with more than just rote knowledge.
They are lying.
They prove they are lying with their relentless emphasis on test security. Teachers may not look at the test, cannot so much as read questions enough to understand the essence of them. Students, teachers, and parents are not allowed to know anything specific about student responses after the fact (making the tests even less useful than they could possibly be).
And now, of course, we've learned that Pearson apparently has a super-secret cyber-security squad that just cruises the interwebs, looking for any miscreant teens who are violating the security of the test and calling the state and local authorities to have those students punished (and, perhaps, mounting denial of service attacks on any bloggers who dare to blog about it).
This shows a number of things, not the least of which is what everyone should already have known-- Pearson puts its own business interests ahead of anything and everything.
But it also tells us something about the test.
You know what kind of test needs this sort of extreme security? A crappy one.
Questions that test "critical thinking" do not test it by saying, "Okay, you can only have a couple of minutes to read and think about this because if you had time to think about it, that wouldn't be critical thinking." A good, solid critical thinking question could take weeks to answer.
Test manufacturers and their cheerleaders like to say that these tests are impervious to test prep-- but if that were true, no security would be necessary. If the tests were impervious to any kind of advance preparation aimed directly at those tests, test manufacturers would be able to throw the tests out there in plain sight, like my tenth grade biology teacher did.
A good assessment has no shortcuts and needs no security. Look at performance-based measures-- no athlete shows up at an event and discovers at that moment, "Surprise! Today you're jumping over that bar!"
Authentic assessment is no surprise at all. It is exactly what you expect because it is exactly what you prepared for, exactly what you've been doing all along-- just, this time, for a grade.
The fact that Big Stupid Test manufacturers insist that their test must be a surprise, that nobody can know anything about it, is a giant, screaming red alarm signal that these tests are crap. In what other industry can you sell a customer a product and refuse to allow them to look at it? It's like selling the emperor his new clothes and telling him they have to stay in the factory closet. Who falls for this kind of bad sales pitch? "Let me sell you this awesome new car, but you can never drive it and it will stay parked in our factory garage. We will drive you around in it, but you must be blindfolded. Trust us. It's a great car." Who falls for that??!!
The fact that they will go to such extreme and indefensible lengths to preserve the security of their product is just further proof that their product cannot survive even the simplest scrutiny.
The fact that product security trumps use of the product just raises this all to a super-Kafkaesque level. It is more important that test security be maintained than it is that teachers and parents get any detailed and useful information from it. Test fans like to compare these tests to, say, tests at a doctor's office. That's a bogus comparison, but even if it weren't, test manufacturers have created a doctor's office in which the doctor won't tell you what test you're getting, and when the test results come back STILL won't tell you what kind of test they gave you and will only tell you whether you're sick or well-- but nothing else, because the details of your test results are proprietary and must remain a secret.
Test manufacturers like Pearson are right about one thing-- we don't need the tests to know how badly they suck, because this crazy-pants emphasis on product security tells us all we need to know. These are tests that can't survive the light of day, that are so frail and fragile and ineffectual that these tests can never be tested, seen, examined, or even, apparently, discussed.
Test manufacturers are telling us, via their security measures, just how badly these tests suck. People just have to start listening.
Pearson Is Big Brother
You've already heard the story by now-- Pearson has been found monitoring students on social media in New Jersey, catching them tweeting about the PARCC test, and contacting the state Department of Education so that the DOE can contact the local school district to get the students in trouble.
You can read the story here at the blog of NJ journalist Bob Braun. Well, unless the site is down again. Since posting the story, Braun's site has gone down twice that I know of. Initially it looked like Braun had simply broken the internet, as readers flocked to the report. Late last night Braun took to facebook to report that the site was under attack and that he had taken it down to stop the attack. As I write this (6:17 AM Saturday) the site and the story are up, though loading slowly.
The story was broken by Superintendent Elizabeth Jewett of the Watchung Hills Regional High School district in an email to her colleagues. In contacting Jewett, Braun learned that she confirmed three instances in which Pearson contacted the NJDOE to turn over miscreant students for the state to track down and punish. [Update: Jewett here authenticates the email that Braun ran.]
Meanwhile, many alert eyes turned up this: Pearson's Tracx, a program that may or may not allow the kind of monitoring we're talking about here.
Several thoughts occur. First, under exactly whose policy are these students to be punished? Does the PARCC involve them taking the same kind of high security secrecy pledge that teachers are required to take, and would such a pledge even be binding if signed by a minor, anyway?
How does this fit with the ample case law already establishing that, for instance, students can go on line and create websites or fake facebook accounts mocking school administrators? They can mock their schools, but they have to leave important corporations alone?
I'm also wondering, again, how any test that requires this much tight security could not suck. Seriously.
How much of the massive chunk of money paid by NJ went to the line item "keep an eye on students on line?"
Granted, the use of the word "spying" is a bit much-- social media are not exactly secret places where the expectation of privacy is reasonable or enforceable, and spying on someone there is a little like spying on someone in a Wal-mart. But it's still creepy, and it's still one more clear indicator that Pearson's number one concern is Pearson's business interests, not students or schools or anything else. And while this is not exactly spying, the fact that Pearson never said a public word about their special test police cyber-squad, not even to spin it in some useful way, shows just how far above student, school, and state government they envision themselves to be.
Pearson really is Big Brother-- and not just to students, but to their parents, their schools, and their state government. It's time to put some serious pressure on politicians. If they're even able to stand up to Pearson at this point, now is the time for them to show us.
Tuesday, March 3, 2015
The Heavy Federal Hand
Chicago Public Schools caved.
The district's CEO Barbara Byrd-Bennett was holding out for a limited rollout of the PARCC, administering the widely unloved Big Standardized Mess of a Test to only 10% of CPS students. But the Chicago system has backed down.
It has not backed down because leaders saw the error of their ways. There was no 11th hour meeting in which test designers hunkered down with school officials to show them how the test is actually swell. There was no last-minute visit from educational experts to help Chicago schools see how the PARCC has great educational advantages and will serve the needs of Chicago students.
There were just threats. Threats from Arne Duncan's Department of Education. Threats from the federal government.
Duncan's USED likes to adopt a stance that they are just uninvolved bystanders in the Great Ed Reform Discussion. Common Core and the other reformster programs like charter boosting and Big Standardized Tests were voluntarily adopted by the states. Says Duncan's office, "Federal overreach wielding a big fat stick? Moi?? Surely vous jests."
But just as Dolores Umbridge occasionally snaps and drops her cheery facade to reveal the raging control freak underneath, the USED occasionally puts its foot down and demands obedience, or else.
They did it to Washington State when legislators refused to install a teacher evaluation program that Duncan approved of. And now they've done it to Chicago schools.
"Give the test we want, the way we want it given," comes word from DC, "Or we will take away $1.4 billion from your system. Do as we say, or the big stick comes out."
And so CPS folded, and I can't say that I blame them. Taking a stand is a great thing, but making the students of your district take a $1.4 billion cut to do it is a heck of a big stand to take, and probably not responsible behavior for district leaders.
Was their principled stand a waste? Not at all. For one thing, people have seen one of America's largest school systems cast a huge vote of No Confidence in the Big Standardized Test. For another, Americans have one more chance to see the heavy hand of the feds revealed again. There's no pretending that anything happened here other than federal extortion-- do as we say, or we cut you. It's one more clear picture of where modern ed reform really came from and what really keeps it alive, and it's one more motivator for Congress to get ESEA rewritten.
It is true that the meanest, craziest person in the room gets to control the conversation. But they can only do it by revealing how mean and crazy they are, and in the long run that earns them neither friends nor allies. To use their heavy hand, they had to show their true face. They may win the battle, but they position themselves badly for the war.
Sunday, February 22, 2015
Russ Walsh: Checking the PARCC and SBA
Russ Walsh is a reading specialist who also maintains a mighty fine blog. While Russ is always worth reading, over the last two weeks he has produced a series of posts that you need to bookmark, store, steal, link--whatever it is you do with posts that you want to be able to use as reference works in the future.
Walsh ventured where surprisingly few have dared to tread. He looked at the readability levels of the Two Not-As-Big-As-They-Used-To-Be tests-- the PARCC and the SBA.
The PARCC came first, and he took three pieces to do it justice.
In Part I, Walsh looks at readability levels of the PARCC reading selections, using several of the standard readability measures. That's no small chunk of extra homework to self-assign, but the results are revealing. Walsh finds that most of the selections are significantly above the stated grade level, the very definition of frustration level. Not a good way to scare up useful or legitimate data.
In Part 2, Walsh looks at readability levels of PARCC questions, looking at the types of tasks they involve and what extra challenges they may contain. Again, some serious homework and analysis here. Walsh finds the PARCC questions wanting in this area as well.
In Part 3, Walsh goes looking into PARCC from the standpoint of the reader. Does the test show a cultural bias, or favor students with a particular body of prior knowledge? That would be a big fat yes on both. Plus, the test involves some odd choices that add extra roadblocks for readers.
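Walsh doesn't share his tooling, so for readers who have never run one of the "standard readability measures" he uses in Part I, here is a minimal, hedged sketch of one of them-- the Flesch-Kincaid grade level-- applied to the crop-crime sentence from PARCC's DNA selection. The formula is the standard published one; the crude vowel-group syllable counter is my own shortcut, not Walsh's method, so treat the output as a ballpark figure only.

```python
import re

def count_syllables(word: str) -> int:
    # Very rough heuristic: count runs of consecutive vowels.
    # Real readability tools use pronunciation dictionaries.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Standard Flesch-Kincaid grade-level formula.
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)

sample = ("DNA testing, the technique which has helped solve high-profile "
          "murder cases, may now help to solve crop crimes.")
print(round(flesch_kincaid_grade(sample), 1))
```

Score any passage this way and you can set the result next to the test's stated grade level, which is essentially the comparison Walsh ran across the whole PARCC item set.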
Walsh followed this series up with a post looking at the SBA. In some ways this was the most surprising post, because Walsh finds the SBA test.... not so bad. While we may think of PARCC (by Pearson) and SBA (by AIR) as Tweedledee and Tweedledum, it appears that what we actually have is Tweedledee and Bob.
These posts are literate, rational, and professional (everything that my feisty but personal reading of PARCC was not) and consequently hugely useful. This is hard, solid analysis presented clearly and objectively, which makes these posts perfect for answering the questions of civilians and administrators alike. I have been reading Russ Walsh for a while, and he never disappoints, but these four posts belong in some sort of edublogger hall of fame. Read them!
PARCC Loves Monsanto?
It's been two weeks since I ploughed through the PARCC sample test items, and the swelling in my brain has mostly subsided. But there has been one thing that has nagged at me ever since, and today I'd like to revisit that.
The first set of questions deal with a selection about the use of DNA with crops-- not, as you might guess, strictly with developing better crops through genetic manipulation, which is its own kettle of two-headed fish, but through something else...
DNA testing, the technique which has helped solve high-profile murder cases, may now help to solve crop crimes.
You might well ask-- what the hell is a crop crime? Did somebody find a bunch of cows pummeled to death with no evidence except traces of corn stalks? Are there unsolved bank robberies out there where the only clue is a small pile of wheat? The selection doesn't provide much of a hint, other than to mention theft in passing.
But for several years I've had my students read Fast Food Nation and we follow it up with Food, Inc-- so the idea of a crop crime definitely rang a bell. Here's a clip from the film.
Because Monsanto owns certain crops, it reserves the right to track down anyone they think might be using their patented seeds without having paid for it. This would include someone who has had GMO pollen blown into their field by the wind.
But of course corn and soybeans just look like corn and soybeans. If Monsanto thought you had grabbed some of their DNA, how would they prove it (so they could take you to court and stomp on you)? They would need some DNA testing to catch you perpetrating this crop crime.
PARCC has been criticized for including "product placement" in its testing, with brand names and logos included in the questions. But this is even creepier-- a selection that includes a whole corporate philosophy. The issues here are huge and difficult and complex-- Should a corporation own a life form, or the DNA of a life form? Should the legal system let itself be used as corporate cops? Does our need for plentiful food justify extra protections for food corporations? And that's before we get into How the Justice System Works questions.
But the PARCC question slips right past that and buries a host of challenging assumptions in this reading test. For my money (and hey-- I'm a taxpayer, so it is my money), this is far creepier than the root beer logo, and adds a whole extra problematic level for students who are knowledgeable about the issues the reading selection blithely raises.
Maybe it's simply that Monsanto has done its job so well that PARCC writers included the selection without question. Or maybe this is just how the corporate club helps keep its own point of view out there. But for me it's just one more huge PARCC fail.
Saturday, February 21, 2015
No National Test
As fans of test-driven accountability (as well as test-generated profits) continue to argue vigorously for the continued repeated use of Big Standardized Testing, there is one argument you won't hear much any more.
Today, there is no easy and rigorous way to compare the performance of individual students or schools in different states....If students take the same assessment under the same conditions, a given score in one place has the same meaning as it does in all others.
That's from a joint paper issued by ETS, Pearson, and the College Board back in 2010. Back in 2011, USED's National Center for Educational Statistics released a report complaining that fifty different states had fifty different measures of student achievement.
The dream of Common Core was that every state would be studying the same thing. A student in Idaho could move to Alabama and pick up math class right where he left off, and the only way to ensure that was going to be that Idaho and Alabama would be measuring their students with the same yardstick. Schools and students would be comparable within and across state boundaries.
That is not going to happen.
The attempt to create a national assessment is a failure. States continue to abandon the SBA and the PARCC; SBA is down to twenty-ish states and PARCC is under a dozen. The situation is so messy that I have to give you approximations, because it depends on who's counting and when-- Mississippi just pulled out, several other states are eagerly eyeing the exits, and I can't find any listing of ins and outs that is reliable and up-to-date. (And that is before we even talk about how many students within testing states will opt out of their test.)
But what's important is this-- whether the number of states participating is a little over thirty or a little under, it is not fifty. It is not close to fifty. And to the extent that the number is changing, it is not moving toward fifty.
Now, granted, the number is also a bit of a lie. As with the Common Core standards, several states have abandoned the national assessments in name only. Utah, for instance, dropped out of the SBAC, and then promptly hired the very company that produces the SBA to build its new non-SBA test. Pennsylvania dropped out of the PARCC, and yet our new tests are very, very PARCC-like.
So many states are, in fact, quietly sticking close to the beloved national assessment-- but because they are politically unlikely to ever admit it, the damage is the same for the lovers of national assessment: the anti-nationalist states won't allow themselves to become part of a truly national testing program.
Of course, if we wanted a national testing program, we could always go back to paying attention to the NAEP, but it's due for an upgrade and in today's climate, it's hard to imagine how such a job could be done. And it's a pre-existing product, so it certainly doesn't represent a new opening into the testing market. The current test-driven accountability wave has driven billions (with a b) of dollars into test corporation coffers. But the dream of one simple open market has fallen apart. Pearson and AIR and the rest have been forced to do business the old, messy way.
So we can't compare the students of Idaho to the students of Florida. We can't stack-rank the schools of Pennsylvania against the schools of Texas. We cannot measure how the Common Core is doing in every corner of the nation. There is no national, common assessment, and there never will be. On this point, at least, the reformsters have failed.
The PARCC Fairy Tale
The fairy tale surrounding PARCC and the other Big Standardized Tests has been tweaked and rewritten and adapted, but some folks still enjoy telling it, and every once in a while I come across (like the brothers Grimm searching the countryside for classic old material) a particularly simple and straightforward version of the old classic. That's what we're looking at today.
Andrea Townsend describes her job as coordinating services for students with special needs in the schools of Greenville, Ohio (northwest of Dayton), but her LinkedIn profile shows a broader range of responsibilities (like food service). She was previously an elementary principal, and before that spent nine years as an intervention specialist. She started her career as a satellite instructor connected to a vocational school for three years. She has a bachelor's in Vocational Agriculture Education and a master's in Educational Leadership.
Townsend thinks the PARCC is getting a bad rap, and she took to a community website to share that view in a piece that was later picked up by some other regional media.
I feel the need to make an unpopular statement of my opinion. Here goes… I support the new statewide tests.
So she knows she's out on a limb here. Her piece provides a testament to the misinformation that still persists and the false narrative that reformsters are still trying to sell.
Educators and legislators in our state adopted new standards to guide the instruction for public schools several years ago. These standards are focused on the skills students need to be successful in college or their career or both. The standards look at critical thinking and problem solving skills as well as developing a student’s ability communicate clearly. These skills are paramount to success in our ever changing, global and technology driven world.
Chapter One of the Tale of Test-Driven Accountability remains the same. "Once upon a time, we adopted the magical Common Core." You'll note that even though Townsend is willing to be controversial and unpopular, she's not crazy enough to promote the Common Core by name, but she does support it with the usual unproven assertions. How does anyone know that the standards cover objectives needed for career or college success? "The standards look at critical thinking"? I looked at a zoo once; that doesn't make me an elephant. Nor do I see any standards that address communicating clearly. Nor do we have a whit of evidence of exactly what skills are paramount to success.
According to the PARCConline.org website, “The new tests also are being developed in response to the longstanding concerns of educators, parents and employers who want assessments that better measure students’ critical-thinking and problem-solving skills and their ability to communicate clearly.”
Come on, Ms. Townsend-- you're better than this. According to Budweiser ads, drinking beer will make me attractive to hot blondes. According to Tony the Tiger, Frosted Flakes will make me great. As an administrator, you've had to deal with numerous vendors-- when they're trying to sell you something, do you just take their word for it, or do you check things out and verify? PARCC is just a big test vendor. Do you have any proof of their test's awesomeness beyond their own word?
Next she raises the issue of a diverse student population, specifically considering students with special needs. Again, with no back-up other than a quote from PARCC, she asserts that PARCC totally handles a wide range of students-- without ever altering the content. PARCC just allows for different ways to interact with the test, but it is great for assessing students at the far reaches of the scale-- which is really difficult to do. Much has been written about the inadequacy of PARCC's accommodations (here's one example), so we'll need more than just PARCC's word for it here, too.
Acquiring skills begins with a clear understanding of two things. First we must clearly understand what skill we want. Second we must clearly understand the skills we already have. When we have those two pieces of information, we are able to learn, practice and apply skills between those we have and those we want. It is important in education that we have the clearest understanding of the skills each student has and the skills each student needs.
Chapter Two of the Tale includes the story of how the magical PARCC will let us know exactly what our students do and don't know. Again, we know this because PARCC says so. But the PARCC is not a formative assessment, and its results are neither fine-grained enough nor quickly returned enough nor transparent enough (remember, teachers aren't allowed to so much as look at the test questions) to help any teacher-- certainly not to give the kind of help that a teacher gets from her own assessments and data in the classroom.
Change is hard, says Townsend. And some of the process of change has been problematic. But she still supports the PARCC. And she has a quote from somebody's Facebook page to back that up.
The lead line says that Townsend wrote this with the support of Greenville City Schools' Central Office, so it's unclear exactly how much this represents the district's point of view. But it does represent the fairy tale that continues to be the supporting narrative for PARCC:
Common Core Standards are magical and will make all students ready for college and career. To know if they're really acquiring those skills, we must have a magical test that can measure exactly how skilled each student has become, so that teachers can fine tune their instruction. The PARCC is that test.
That's the story, and every single sentence of it is riddled with unproven, unsupported assertions. Townsend has given us a fairly straightforward retelling of the classic, but it still rests on magical standards, magical testing, and magical thinking.
Tuesday, February 10, 2015
Sorting the Tests
Since the beginnings of the current wave of test-driven accountability, reformsters have been excited about stack ranking-- the process of sorting out items from the very best to the very worst (and then taking a chainsaw to the very worst).
This has been one of the major supporting points for continued large-scale standardized testing-- if we didn't have test results, how would we compare students to other students, teachers to other teachers, schools to other schools? The devotion to sorting has been foundational, rarely explained but generally presented as an article of faith, a self-evident value-- well, of course we want to compare and sort schools and teachers and students!
But you know what we still aren't sorting?
The Big Standardized Tests.
Since last summer the rhetoric to pre-empt the assault on testing has focused on "unnecessary" or "redundant" or even "bad" tests, but we have done nothing to find these tests.
Where is our stack ranking for the tests?
We have two major BSTs-- the PARCC and the SBA. In order to better know how my child is doing (isn't that one of our repeated reasons for testing), I'd like to know which one of these is a better test. There are other state-level BSTs that we're flinging at our students willy-nilly. Which one of these is the best? Which one is the worst?
I mean, we've worked tirelessly to sort and rank teachers in our efforts to root out the bad ones, because apparently "everybody" knows some teachers are bad. Well, apparently everybody knows some tests are bad, so why aren't we tracking them down, sorting them out, and publishing their low test ratings in the local paper?
We've argued relentlessly that I need to be able to compare my student's reading ability with the reading ability of Chris McNoname in Iowa, so why can't I compare the tests that each one is taking?
I realize that coming up with a metric would be really hard, but so what? We use VAM to sort out teachers and it has been debunked by everyone except people who work for the USED. I think we've established that the sorting instrument doesn't have to be good or even valid-- it just has to generate some sort of rating.
So let's get on this. Let's come up with a stack-ranking method for sorting out the SBA and the PARCC and the Keystones and the Indiana Test of Essential Student Swellness and whatever else is out there. If we're going to rate every student and teacher and school, why would we not also rate the raters? And then once we've got the tests rated, we can throw out the bottom ten percent of them. We can offer a "merit bonus" to the company that made the best one (and peanuts to everyone else) because that will reward their excellence and encourage them to do a good job! And for the bottom twenty-five percent of the bad tests, we can call in turnaround experts to take over the company.
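Just to underline how little machinery this whole ritual actually requires, here's a tongue-in-cheek sketch of the sort-and-chainsaw procedure described above. Every test rating in it is invented (the ITESS, remember, does not exist), and the 10% and 25% cut lines are simply the ones floated in the paragraph above.

```python
# Invented ratings for (mostly) invented tests; the point is the ritual, not the data.
ratings = {
    "PARCC": 61,
    "SBA": 72,
    "Keystone": 55,
    "ITESS": 48,
    "Generic State Test": 66,
}

ranked = sorted(ratings, key=ratings.get, reverse=True)  # best to worst

chainsaw_count = max(1, round(len(ranked) * 0.10))    # bottom 10% get thrown out
turnaround_count = max(1, round(len(ranked) * 0.25))  # bottom 25% get "turnaround experts"

print("Merit bonus goes to:", ranked[0])
print("Turnaround experts sent to:", ranked[-turnaround_count:])
print("Thrown out entirely:", ranked[-chainsaw_count:])
```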
In fact-- why not test choice? If my student wants to take the PARCC instead of the ITESS because the PARCC is rated higher, why shouldn't my student be able to do that? And if I don't like any of them, why shouldn't I be able to create a charter test of my own in order to look out for my child's best interests? We can give every student a little testing voucher and let the money follow them to whatever test they would prefer to take from whatever vendors pop up.
Let's get on this quickly, because I think I've just figured out how to make a few million dollars, and it's going to take at least a weekend to whip up my charter test company product. Let the sorting and comparing and ranking begin!
Sunday, February 8, 2015
Sampling the PARCC
Today, I'm trying something new. I've gotten myself onto the PARCC sample item site and am going to look at the ELA sample items for high school. This set was updated in March of 2014, so, you know, it's entirely possible they are not fully representative, given that the folks at Pearson are reportedly working tirelessly to improve testing so that new generations of Even Very Betterer Tests can be released into the wild, like so many majestic lion-maned dolphins.
So I'm just going to live blog this in real-ish time, because we know that one important part of measuring reading skill is that it should not involve any time for reflection and thoughtful revisiting of the work being read. No, the Real Readers of this world are all Wham Bam Thank You Madam Librarian, so that's how we'll do this. There appear to be twenty-three sample items, and I have two hours to do this, so this could take a while. You've been warned.
PAGE ONE: DNA
Right off the bat I can see that taking the test on computer will be a massive pain in the ass. Do you remember frames, the website formatting that was universally loathed and rapidly abandoned? This reminds me of that. The reading selection is in its own little window and I have to scroll the reading within that window. The two questions run further down the page, so when I'm looking at the second question, the window with the selection in it is halfway off the screen, so to look back to the reading I have to scroll up in the main window and then scroll up and down in the selection window and then take a minute to punch myself in the brain in frustration.
The selection is about using DNA testing for crops, so fascinating stuff. Part A (what a normal person might call "question 1") asks us to select three out of seven terms used in the selection, picking those that "help clarify" the meaning of the term "DNA fingerprint," so here we are already ignoring the reader's role in reading. If I already understand the term, none of them help (what helped you learn how to write your name today?), and if I don't understand the term, apparently there is only one path to understanding. If I decide that I have to factor in the context in which the phrase is used, I'm back to scrolling in the little window and I rapidly want to punch the test designers in the face. I count at least four possible answers here, but only three are allowed. Three of them are the only answers to use "genetics" in the answer; I will answer this question based on guesswork and trying to second guess the writer.
Part B is a nonsense question, asking me to come up with an answer based on my first answer.
PAGE TWO: STILL FRICKIN' DNA
Still the same selection. Not getting any better at this scrolling-- whether my mouse roller scrolls the whole page or the selection window depends on where my cursor is sitting.
Part A is, well... hmm. If I asked you, "Explain how a bicycle is like a fish," I would expect an answer from you that mentioned both the bicycle and a fish. But PARCC asks how "solving crop crimes is like solving high-profile murder cases." But all four answers mention only the "crop crime" side of the comparison, and the selection itself says nothing about how high-profile murder cases are solved. So are students supposed to already know how high-profile murder cases are solved? Should they assume that things they've seen on CSI or Law and Order are accurate? To answer this we'll be reduced to figuring out which answer is an accurate summary of the crop crime techniques mentioned in the selection.
This is one of those types of questions that we have to test prep our students for-- how to "reduce" a seemingly complex question to a simpler one. This question pretends to be complex; it is actually asking, "Which one of these four items is actually mentioned in the selection?" It boils down to picky gotcha baloney-- one answer is going to be wrong because it says that crop detectives use computers "at crime scenes."
Part B. The old "which detail best supports" question. If you blew Part A, these answers will be bizarrely random.
PAGE THREE: DNA
Still on this same damn selection. I now hate crops and their DNA.
Part A wants to know what the word "search" means in the heading for the final graph. I believe it means that the article was poorly edited, but that answer is not among the choices. The distractor in this set is absolutely true; it requires test-taking skills to eliminate it, not reading skills.
Part B "based on information from the text" is our cue (if we've been properly test prepped) to go look for the answer in the text, which would take a lot less time if not for this furshlugginer set up. The test writers have called for two correct answers, allowing them to pretend that a simple search-and-match question is actually complex.
PAGE FOUR: DNA GRAND FINALE, I HOPE
Ah, yes. A test question that assesses literally nothing useful whatsoever. At the top of the page is our selection in a full-screen width window instead of the narrow cramped one. At the bottom of the page is a list of statements, two of which are actual advantages of understanding crop DNA. Above them are click-and-drag details from the article. You are going to find the two advantages, then drag the supporting detail for each into the box next to it. Once you've done all this, you will have completed a task that does not mirror any real task done by real human beings anywhere in the world ever.
This is so stupid I am not even going to pretend to look for the "correct" answer. But I will remember this page clearly the next time somebody tries to unload the absolute baloney talking point that the PARCC does not require test prep. No students have ever seen questions like this unless a teacher showed them such a thing, and no teacher ever used such a thing in class unless she was trying to get her students ready for a cockamamie standardized test.
Oh, and when you drag the "answers," they often don't fit in the box and just spill past the edges, looking like you've made a mistake.
PAGE FIVE: FOR THE LOVE OF GOD, DNA
Here are the steps listed in the article. Drop and drag them into the same order as in the article. Again, the only thing that makes this remotely difficult is wrestling with the damn windows. This is a matching exercise, proving pretty much nothing.
PAGE SIX: APPARENTLY THIS IS A DNA TEST TEST
By now my lower-level students have stopped paying any attention to the selection and are just trying to get past it to whatever blessed page of the test will show them something else.
Part A asks us to figure out which question is answered by the selection. This is one of the better questions I've seen so far. Part B asks which quote "best" supports the answer for A. I hate these "best" questions, because they reinforce the notion that there is only one immutable approach for any given piece of text. It's the very Colemanian idea that every text represents only a single destination and there is only one road by which to get there. That's simply wrong, and reinforcing it through testing is also wrong. Not only wrong, but a cramped, tiny, sad version of the richness of human understanding and experience.
PAGE SEVEN: SOMETHING NEW
Here comes the literature. First we get 110 lines of Ovid re: Daedalus and Icarus (in a little scrolling window). Part A asks which one of four readings is the correct one for lines 9 and 10 (because reading, interpreting and experiencing the richness of literature is all about selecting the one correct reading). None of the answers are great, particularly if you look at the lines in context, but only one really makes sense. But then Part B asks which other lines support your Part A answer and the answer here is "None of them," though there is one answer for B that would support one of the wrong answers for A, so now I'm wondering if the writers and I are on a different page here.
PAGE EIGHT: STILL OVID
Two more questions focusing on a particular quote, asking for an interpretation and a quote to back it up. You know, when I say it like that, it seems like a perfectly legitimate reading assessment. But when you turn that assessment task into a multiple choice question, you break the whole business. "Find a nice person, get married and settle down," seems like decent-ish life advice, but if you turn it into "Select one of these four people, get married in one of these four ceremonies, and buy one of these four houses" suddenly it's something else.
And we haven't twisted this reading task for the benefit of anybody except the people who sell, administer, score and play with data from these tests.
PAGE NINE: OVID
The test is still telling me that I'm going to read two selections but only showing me one. If I were not already fully prepped for this type of test and test question, I might wonder if something were wrong with my screen. So, more test prep required.
Part A asks what certain lines "most" suggest about Daedalus, as if that is an absolute objective thing. Then you get to choose what exact quotes (two, because that makes it more complex) back you up. This is not constructing an interpretation of a piece of literature. Every one of these questions makes me angrier as a teacher of literature and reading.
PAGE TEN: ON TO SEXTON
Here's our second poem-- "To a Friend Whose Work Has Come To Triumph." The two questions are completely bogus-- Sexton has chosen the word "tunneling" which is a great choice in both its complexity and duality of meaning, a great image for the moment she's describing. But of course in test land the word choice only "reveals" one thing, and only one other piece of the poem keys that single meaning. I would call this poetry being explained by a mechanic, but that's disrespectful to mechanics.
PAGE ELEVEN: MORE BUTCHERY
Determine the central idea of Sexton's poem, as well as specific details that develop the idea over the course of the poem. From the list of Possible Central Ideas, drag the best Central Idea into the Central Idea box.
Good God! This at least avoids making explicit what is implied here-- "Determine the central idea, then look for it on our list. If it's not there, you're wrong." Three of the four choices are okay-ish, two are arguable, and none would impress me if they came in as part of a student paper.
I'm also supposed to drag-and-drop three quotes that help develop the One Right Idea. So, more test prep required.
PAGE TWELVE: CONTRAST
Now my text window has tabs to toggle back and forth between the two works. I'm supposed to come up with a "key" difference between the two works (from their list of four, of course) and two quotes to back up my answer. Your answer will depend on what you think "key" means to the test writers. Hope your teacher did good test prep with you.
PAGE THIRTEEN: ESSAY TIME
In this tiny text box that will let you view about six lines of your essay at a time, write an essay "that provides an analysis of how Sexton transforms Daedalus and Icarus." Use evidence from both texts. No kidding-- this text box is tiny. And no, you can't cut and paste quotes directly from the texts.
But the big question here-- who is going to assess this, and on what basis? Somehow I don't think it's going to be a big room full of people who know both their mythology and their Sexton.
PAGE FOURTEEN: ABIGAIL ADAMS
So now we're on to biography. It's a selection from the National Women's History Museum, so you know it is going to be a vibrant and exciting text. I suppose it could be worse--we could be reading from an encyclopedia.
The questions want to know what "advocate for women" means, and to pick an example of Adams being an advocate. In other words, the kinds of questions that my students would immediately ID as questions that don't require them to actually read the selection.
PAGE FIFTEEN: ADAMS
This page wants to know which question goes unanswered by the selection, and then for Part B asks to select a statement that is true about the biography but which supports the answer for A. Not hopelessly twisty.
PAGE SIXTEEN: MORE BIO
Connect the two central ideas of this selection. So, figure out what the writers believe are the two main ideas, and then try to figure out what they think the writers see as a connection. Like most of these questions, these will be handled backwards. I'm not going to do a close reading of the selection-- I'm going to close read the questions and answers and then use the selection just as a set of clues about which answer to pick. And this is how answering multiple choice questions about a short selection is a task not much like authentic reading or pretty much any other task in the world.
PAGE SEVENTEEN: ABIGAIL LETTER
Now we're going to read the Adams family mail. This is one of her letters agitating for the rights of women; our questions will focus on her use of "tyrant" based entirely on the text itself, because no conversation between Abigail and John Adams mentioning tyranny in 1776 could possibly be informed by any historical or personal context.
PAGE EIGHTEEN: STILL VIOLATING FOUNDING FATHER & MOTHER PRIVACY
Same letter. Now I'm supposed to decide what the second graph most contributes to the text as a whole. Maybe I'm just a Below basic kind of guy, but I am pretty sure that the correct answer is not among the four choices. That just makes it harder to decide which other two paragraphs expand on the idea of graph #2.
PAGE NINETEEN: BOSTON
Now we'll decide what her main point about Boston is in the letter. This is a pretty straightforward and literal reading for details kind of question. Maybe the PARCC folks are trying to boost some morale on the home stretch here.
Oh hell. I have a message telling me I have less than five minutes left.
PAGE TWENTY: JOHN'S TURN
Now we have to pick the paraphrase of a quote from Adams that the test writers think is the berries. Another set of questions that do not require me to actually read the selection, so thank goodness for small favors.
PAGE TWENTY-ONE: MORE JOHN
Again, interpretation and support. Because making sense out of colonial letter-writing English is just like current reading. I mean, we've tested me on a boring general science piece, classical poetry, modern poetry, and a pair of colonial letters. Does it seem like that sampling should tell us everything there is to know about the full width and breadth of student reading ability?
PAGE TWENTY-TWO: BOTH LETTERS
Again, in one page, we have two sets of scrollers, tabs for toggling between works, and drag and drop boxes for the answers. Does it really not occur to these people that there are students in this country who rarely-if-ever lay hands on a computer?
This is a multitask page. We're asking for a claim made by the writer and a detail to back up that claim, but we're doing both letters on the same page and we're selecting ideas and support only from the options provided by the test. This is not complex. It does not involve any special Depth of Knowledge. It's just a confusing mess.
PAGE TWENTY-THREE: FINAL ESSAY
Contrast the Adamses' views of freedom and independence. Support your response with details from the three sources (yes, we've got three tabs now). Write it in this tiny text box.
Do you suppose that somebody's previous knowledge of John and Abigail and the American Revolution might be part of what we're inadvertently testing here? Do you suppose that the readers who grade these essays will themselves be history scholars and writing instructors? What, if anything, will this essay tell us about the student's reading skills?
DONE
Man. I have put this off for a long time because I knew it would give me a rage headache, and I was not wrong. How anybody can claim that the results from a test like this would give us a clear, nuanced picture of student reading skills is beyond my comprehension. Unnecessarily complicated, heavily favoring students who have prior background knowledge, and absolutely demanding that test prep be done with students, this is everything one could want in an inauthentic assessment that provides those of us in the classroom with little or no actual useful data about our students.
If this test came as part of a packaged bunch of materials for my classroom, it would go in the Big Circular File of publishers' materials that I never, ever use because they are crap. What a bunch of junk. If you have stuck it out with me here, God bless you. I don't recommend that you give yourself the full PARCC sample treatment, but I heartily recommend it to every person who declares that these are wonderful tests that will help revolutionize education. Good luck to them as well.
Tuesday, December 23, 2014
Setting Cut Scores
Benchmark is originally a surveying term. Benchmarks are slots cut into the side of stone (read "permanent") structures into which a bench (basically a little shelf) can be inserted for surveying purposes. We know they're at a certain level because they've been measured in relation to another marker which has been measured in relation to another marker and so on retrogressively until we arrive at a Mean Sea Level marker (everything in surveying is ultimately measured in relation to one of those).
Surveying markers, including benchmarks, are literally set in stone. Anybody with the necessary training can find them always in the same place and measure any other point in relation to them.
This metaphorical sense of unwavering objective measure is what many folks carry with them to their consideration of testing and cut scores. Passing, failing, and excellence, they figure, are all measured against some scholarly Mean Sea Level marker by way of benchmarks that have been carefully measured against MSL and set in stone.
Sorry, no. Instead, cut scores fall somewhere between the work of a blindfolded dart player with his fingers duct-taped together and the guy playing against the blindfolded dart player who simply places the darts exactly where he wants them.
Writing in the Stamford Advocate, Wendy Lecker notes that the Smarter Balanced Assessment Consortium members (including Connecticut's own committed foe of public education Commissioner Stefan Pryor) set cut scores for the SBA tests based on stale fairy dust and the wishes of dying puppies.
People tend to assume that cut scores-- the borderline between Good Enough and Abject Failure-- mean something. If a student fails The Test, she must be unready for college or unemployable or illiterate or at the very least several grades behind where she's Supposed To Be (although even that opens up the question "Supposed by whom?").
In fact, SBAC declares that the achievement levels "do not equate directly to expectations for `on-grade' performance" and test scores should only be used with multiple other sources of information about schools and students.
Furthermore, "SBAC admits it cannot validate whether its tests measure college readiness until it has data on how current test takers do in college."
If you are imagining that cut scores for the high-stakes accountability tests are derived through some rigorous study of exactly what students need to know and what level of proficiency they should have achieved by a certain age-- well, first, take a look at what you're assuming. Did you really think we have some sort of master list, some scholastic Mean Sea Level that tells us exactly what a human being of a certain age should know and be able to do as agreed upon by some wise council of experty experts? Because if you do, you might as well imagine that those experts fly to their meetings on pink pegasi, a flock of winged horsies that dance on rainbows and take minutes of the Wise Expert meetings by dictating to secretarial armadillos clothed in shimmering mink stoles.
Anyway, it doesn't matter, because there are no signs that any of the people associated with The Test are trying to work from such a hypothetical set of academic standards. Instead, what we see over and over (even back in the days of NCLB) is educational amateurs setting cut scores for political purposes. So SBAC sets a cut score so that almost two thirds of the students will fail. John King in New York famously predicted the percentage of test failure before the test was even out the door-- but the actual cut scores were set after the test was taken.
That is not how you measure a test result against a standard. That's how you set a test standard based on the results you want to see. It's how you make your failure predictions come true. According to Carol Burris, King also attempted to find some connection between SAT results and college success prediction, and then somehow graft that onto a cut score for the NY tests, while Kentucky and other CCSS states played similar games with the ACT.
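Just to make the arithmetic concrete-- and this is my own back-of-the-envelope sketch in Python with invented numbers, not anything from SBAC's actual standard-setting documents-- once the results are in, choosing a cut score that fails whatever share of students you've already promised is a one-liner:

# Illustrative sketch only: invented scores, not real SBAC data.
# It shows how a cut score chosen AFTER testing can be placed so it
# produces any predetermined failure rate.
import random

random.seed(0)
# Pretend these are this year's scaled scores for 10,000 test takers.
scores = sorted(random.gauss(2500, 100) for _ in range(10000))

target_failure_rate = 0.65  # "almost two thirds will fail"

# Put the cut score at the 65th percentile of the actual results.
cut_score = scores[int(target_failure_rate * len(scores))]

failures = sum(s < cut_score for s in scores)
print(f"cut score: {cut_score:.0f}")
print(f"students below it: {failures / len(scores):.1%}")  # ~65%, by construction

Move the target rate and the "standard" moves with it; what the students actually did never enters into the decision.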
Setting cut scores is not an easy process. Education Sector, a division of the thinky tank American Institutes for Research (they specialize in behavioral sciency thinking, and have a long pedigree in the NCLB era and beyond), issued an "explainer" in July of 2006 about how states set passing scores on standardized tests. It leads off its section on cut scores with this:
On a technical level, states set cut scores along one of two dimensions: The characteristics of the test items or the characteristics of the test takers. It is essential to understand that either way is an inescapably subjective process. Just as academic standards are ultimately the result of professional judgment rather than absolute truth, there is no “right” way to set cut scores, and different methods have various strengths and weaknesses.
The paper goes on to talk about setting cut scores, and some of it is pretty technical, but it returns repeatedly to the notion that at various critical junctures, some human being is going to make a judgment call.
Educational Testing Service (ETS) also has a nifty "Primer on Setting Cut Scores on Tests of Educational Achievement." Again, from all the way back in 2006, this gives a quick compendium of various techniques for setting cut scores-- it lists eight different methods. And it also opens with some insights that would still be useful to consider today.
The first step is for policymakers to specify exactly why cut scores are being set in the first place. The policymakers should describe the benefits that are expected from the use of cut scores. What decisions will be made on the basis of the cut scores? How are those decisions being made now in the absence of cut scores? What reasons are there to believe that cut scores will result in better decisions? What are the expected benefits of the improved decisions?
Yeah, those conversations have not been happening within anyone's earshot. Then there is this:
It is important to list the reasons why cut scores are being set and to obtain consensus among stakeholders that the reasons are appropriate. An extremely useful exercise is to attempt to describe exactly how the cut scores will bring about each of the desired outcomes. It may be the case that some of the expected benefits of cut scores are unlikely to be achieved unless major educational reforms are accomplished. It will become apparent that cut scores, by themselves, have very little power to improve education. Simply measuring a child and classifying the child’s growth as adequate or inadequate will not help the child grow.
Oh, those crazy folks of 2006. Little did they know that in a few years education reform and testing would be fully committed and devoted to the notion that you can make a pig gain weight by weighing it. All this excellent advice about setting cut scores, and none of it appears to be getting any use these days.
I'm not going to go too much more into this document from a company that specializes in educational testing, except to note that once again, the paper frequently notes that personal and professional judgment is a factor at several critical junctures. I will note that they include this step--
The next step is for groups of educators familiar with students in the affected grades and familiar with the subject matter to describe what students should know and be able to do to reach the selected performance levels.
They are also clear that selecting the judges who will set cut scores means making sure they are qualified, have experience, and reflect a demographic cross section. They suggest that policymakers consider fundamental questions such as whether it is better to pass a student who should fail or to fail a student who should pass. And they are also clear that the full process of setting the cut scores should be documented in painstaking detail, including the rationale for the methodology and the qualifications of the judges.
And they do refer uniformly to the score-setters as judges, because the whole process involves-- say it with me-- judgment.
People dealing with test scores and test results must remember that setting cut scores is not remotely like the process of surveying with benchmarks. Nothing is set in stone, nothing is judged based on its relationship to something set in stone, and everything is set by people using subjective judgment, not objective standards. We always need to be asking what a cut score is based on, and whether it is any better than a Wild Assed Guess. And when cut scores are set to serve a political purpose, we are right to question whether they have any validity at all.
Thursday, October 30, 2014
PARCC Is Magical
Today David Hespe, the acting education commissioner in New Jersey, sent out a letter to Chief School Administrators, Charter School Lead Persons, School Principals, and Test Coordinators.
The re: is "Student Participation in the Statewide Assessment Program." Specifically, it's "why there ought to be some, and how you handle uppity folks who want to avoid it."
In the two page letter, the first page and a half are taken up with a history lesson and a legal brief. Basically, "some laws have been passed, starting with No Child Left Behind, and we think they mean that students have to take the PARCC." (If you want to see the faux legal argument dismantled, check out Sarah Blaine's piece here.)
But then Hespe, correctly suspecting that this might not be sufficient for dealing with recalcitrant parental units, offers this magical paragraph:
In speaking with parents and students, it is perhaps most important to outline the positive reasons that individual students should participate in the PARCC examinations. Throughout a student’s educational career, the PARCC assessments will provide parents with important information about their child’s progress toward meeting the goal of being college or career ready. The PARCC assessments will, for the first time, provide detailed diagnostic information about each individual student’s performance that educators, parents and students can utilize to enhance foundational knowledge and student achievement. PARCC assessments will include item analysis which will clarify a student’s level of knowledge and understanding of a particular subject or area of a subject. The data derived from the assessment will be utilized by teachers and administrators to pinpoint areas of difficulty and customize instruction accordingly. Such data can be accessed and utilized as a student progresses to successive school levels.
The Partnership for Assessment of Readiness for College and Careers (forgot that's what PARCC stands for, didn't you?) is a magical, magical test. It can tell, with absolute precision, how prepared your student is for college or career because, magic. And who wouldn't want to know more about the powerful juju contained in the PARCC test?
So if Mr. Hespe and any of his friends come to explain how crucial PARCC testing is for your child's future, you might try asking some questions.
* Exactly what is the correspondence between PARCC results and college readiness? Given the precise data, can you tell me what score my eight year old needs to get on the test to be guaranteed at least a 3.75 GPA at college?
* Does it matter which college he attends, or will test results guarantee he is ready for all colleges?
* Can you show me the research and data that led you to conclude that Test Result A = College Result X? How exactly do you know that meeting the state's politically chosen cut score means that my child is prepared to be a college success?
* Since the PARCC tests math and language, will it still tell me if my child is ready to be a history or music major? How about geology or women's studies?
* My daughter plans to be a stay-at-home mom. Can she skip the test? Since that's her chosen career, is there a portion of the PARCC that tests her lady parts and their ability to make babies?
* Which section of the PARCC tests a student's readiness to start a career as a welder? Is it the same part that tests readiness to become a ski instructor, pro football player, or dental assistant?
* I see that the PARCC will be used to "customize instruction." Does that mean you're giving the test tomorrow (because it's almost November already)? How soon will the teacher get the detailed customizing information-- one week? Ten days? How will the PARCC results help my child's choir director and phys ed teacher customize instruction?
* Is it possible that the PARCC will soon be able to tell me if my eight year old is on track for a happy marriage and nice hair?
* Why do you suppose you keep using the word "utilize" when "using" is a perfectly good plain English substitute?
* To quote the immortal Will Smith in Independence Day, "You really think you can do all that bullshit you just said?"
The PARCC may look like just one more poorly-constructed standardized math and language test, but it is apparently super-duper magical, with the ability to measure every aspect of a child's education and tell whether the child is ready for college and career, regardless of which college, which major, which career, and which child we are talking about. By looking at your eight year old's standardized math and language test, we can tell whether she's on track to be a philosophy major at Harvard or an airline pilot! It's absolutely magical!
Never has a single standardized test claimed so much magical power with so little actual data to back up its assertions. Mr. Hespe would be further ahead to skip his fancy final paragraph and just tell his people to look parents in the eye and say, "Because the state says so." It's not any more educationally convincing than the magical CACR bullshit, but at least it would be honest.
The re: is "Student Participation in the Statewide Assessment Program." Specifically, it's "why there ought to be some, and how you handle uppity folks who want to avoid it."
In the two page letter, the first page and a half are taken up with a history lesson and a legal brief. Basically, "some laws have been passed, starting with No Child Left Behind, and we think they mean that students have to take the PARCC." (If you want to see the faux legal argument dismantled, check out Sarah Blaine's piece here.)
But then Hespe, correctly suspecting that this might not be sufficient for dealing with recalcitrant parental units, offers this magical paragraph:
In speaking with parents and students, it is perhaps most important to outline the positive reasons that individual students should participate in the PARCC examinations. Throughout a student’s educational career, the PARCC assessments will provide parents with important information about their child’s progress toward meeting the goal of being college or career ready. The PARCC assessments will, for the first time, provide detailed diagnostic information about each individual student’s performance that educators, parents and students can utilize to enhance foundational knowledge and student achievement. PARCC assessments will include item analysis which will clarify a student’s level of knowledge and understanding of a particular subject or area of a subject. The data derived from the assessment will be utilized by teachers and administrators to pinpoint areas of difficulty and customize instruction accordingly. Such data can be accessed and utilized as a student progresses to successive school levels.
The Partnership for Assessment of Readiness for College and Careers (forgot that's what PARCC stands for, didn't you) is a magical magical test. It can tell with absolute precision, how prepared your student is for college or career because, magic. And who wouldn't want to know more about the powerful juju contained in the PARCC test.
So if Mr. Hespe and any of his friends come to explain how crucial PARCC testing is for your child's future, you might try asking some questions.
* Exactly what is the correspondence between PARCC results and college readiness. Given the precise data, can you tell me what score my eight year old needs to get on the test to be guaranteed at least a 3.75 GPA at college?
* Does it matter which college he attends, or will test results guarantee he is ready for all colleges?
* Can you show me the research and data that led you to conclude that Test Result A = College Result X? How exactly do you know that meeting the state's politically chosen cut score means that my child is prepared to be a college success?
* Since the PARCC tests math and language, will it still tell me if my child is ready to be a history or music major? How about geology or women's studies?
* My daughter plans to be a stay-at-home mom. Can she skip the test? Since that's her chosen career, is there a portion of the PARCC that tests her lady parts and their ability to make babies?
* Which section of the PARCC tests a student's readiness to start a career as a welder? Is it the same part that tests readiness to become a ski instructor, pro football player, or dental assistant?
* I see that the PARCC will be used to "customize instruction." Does that mean you're giving the test tomorrow (because it'a almost November already)? How soon will the teacher get the detailed customizing information-- one week? Ten days? How will the PARCC results help my child's choir director and phys ed teacher customize instruction?
* Is it possible that the PARCC will soon be able to tell me if my eight year old is on track for a happy marriage and nice hair?
* Why do you suppose you keep using the word "utilize" when "using" is a perfectly good plain English substitute?
* To quote the immortal Will Smith in Independence Day, "You really think you can do all that bullshit you just said?"
The PARCC may look like just one more poorly-constructed standardized math and language test, but it is apparently super-duper magical, with the ability to measure every aspect of a child's education and tell whether the child is ready for college and career, regardless of which college, which major, which career, and which child we are talking about. By looking at your eight year old's standardized math and language test, we can tell whether she's on track to be a philosophy major at Harvard or an airline pilot! It's absolutely magical!
Never has a single standardized test claimed so much magical power with so little actual data to back up its assertions. Mr. Hespe would be further ahead to skip his fancy final paragraph and just tell his people to look parents in the eye and say, "Because the state says so." It's not any more educationally convincing than the magical CACR bullshit, but at least it would be honest.
Monday, October 20, 2014
Questioning the Test
Sarah Blaine blogs over at parentingthecore, and while she is not very prolific, her posts are often thoughtful and thought-provoking (she is the same blogger who dissected the implications of the Pearson wrong answer).
Blaine has been getting ready for PARCC Family Presentation night at her daughter's school, and she has prepared a list that I think would be an entirely appropriate set of questions for anyone to ask a school board, elected official, or education department bureaucrat who started making noise about the awesomeness of the Testing Regime we now live under. You should just follow the link to read the full piece, but let me give you a taste.
Some of the questions address the nuts and bolts of testing, but hit right at the heart of testing issues. There are some obvious ones, like:
How many hours of testing for 3rd graders? 4th graders? 5th graders?
But this next one is one of my favorites, precisely because it isn't asked often enough:
What in-district adults are proctoring and reviewing the PARCC tests to ensure that the test questions are not poorly worded, ambiguous, and/or that correct answer choices are provided for multiple choice tasks?
These are also winners:
What data do you expect to receive from PARCC that will be available to classroom teachers to guide instruction? When will PARCC scores and results be available?
Who scores the subjective portions of the PARCC tests? What are those people’s qualifications?
What steps are you taking to ensure that our 8, 9, and 10 year old students have the typing skills necessary to compose essays with keyboards? How much time is being spent on preparing children to acquire the skills necessary to master the PARCC interface? Is the preparation process uniform throughout the district? If it is not, doesn’t this mean that we won’t be able to make apples-to-apples comparisons of student scores even across the district?
Some of Blaine's questions are considerably more in-your-face, which is why I love them:
Will students lose points on math assessments if they do not use specific Common Core strategies to solve problems (e.g., performing multiplication the traditional way rather than drawing an array)? My child lost full credit on the following Envisions math test problem this year: “Write a multiplication sentence for 3 + 3 + 3 + 3 + 3 = 15″ because she wrote 3 x 5 = 15 instead of 5 x 3 = 15. Will children be losing points on PARCC for failure to make meaningless distinctions such as this one?
There are plenty more where these came from, including links to articles and information that help inform the area in question. And though she was aiming at the PARCC, her list works just fine for whatever big dumb high stakes test your part of the world is pushing.
The world needs more of these questions. Too many people responsible for providing some form of educational leadership keep just doing dumb things because nobody asks them any questions or challenges any of their dumb proposals. It would be fun to watch what happened if a whole group of parents attended a meeting with Blaine's questions in hand.