You may recall that the last time we checked in, York PA was on a fast track to the suspension of democracy. But that train has been called back to the station.
York Schools were among the PA schools suffering severe financial distress (PA has operated with a school funding system that produces a lot of local financial hardship). The previous administration of Tom Corbett had used that as a trigger to install a district recovery officer, and just a few months ago-- almost as if we were in a hurry to do a deal before the new governor took office-- a PA judge ruled that the district could go into receivership, a nifty system in which the democratically elected school board is stripped of power and the state-appointed receiver can do as he wishes.
What David Meckley, the receiver, wanted to do was turn the whole district over to the for-profit charter chain Charter Schools USA. Lots of people thought that was an awful idea (among other problems, there was no reason to believe that CSUSA had a clue what to do with the district once they took it over). The judge who ruled in the case did so based on a close reading of the law, declaring that even if the plan was clearly terrible, that wasn't his problem. That ruling was being appealed.
But now all of that has come to a screeching halt.
The full account is in Friday's York Daily Record. The short headline version is simple-- David Meckley has resigned as recovery officer. The longer version is encouraging for Pennsylvanians (like me) who weren't really sure which way new governor Tom Wolf's wind would be blowing-- Meckley resigned because the governor's office made it plain that charters were off the table.
There was apparently an intermediate stage, during which Meckley and locals and the state fiddled with a charter-public mix plan.
Meckley said in an interview that, around December, he, district administrators, the proposed charter board and some community leaders had crafted an alternative plan that involved a mix of district- and charter-run buildings. He said he had significant conversations with the Wolf administration about it, but "ultimately the position came down that charters are off the table."
And so, reading the writing on the wall, Meckley has resigned, and the search for a new receiver is on. The board president is wryly hopeful.
"My understanding is they wanted to put someone in that position who knows about the educational aspect of schools," she said.
Meckley, even on his way out the door, continued to demonstrate that he was not that education-understanding guy by expressing his belief that a receivership was necessary because if the schools weren't going to be punished into excellence, they would never get there (I'm paraphrasing).
Wolf has stated, via his proposed budget, his intention to get funding back up to a higher level in Pennsylvania. What the budget will actually look like once it gets past the GOP-controlled legislature is another question. But this move in York follows Wolf's replacement of the chairman of the board that runs Philly schools after the defrocked chair approved more charters in Philly in opposition to Wolf's stated desire to have no more Philly charters.
Meanwhile, York has plenty of problems still to solve. The York Daily Record quotes Clovis Gallon, a teacher who was one of the leaders of the local charter opposition:
"Clearly we recognize the fact there's a lot of
work to do with our students, with our community, with our school
district," he said. "We're ready to accept that challenge. As a parent,
as a teacher, I'm ready to accept the challenge."
Saturday, March 14, 2015
Pearson Proves PARCC Stinks
When I was in tenth grade, I took a course called Biological Sciences Curriculum Studies (BSCS). It was a course known for its rigor and for its exceedingly tough tests.
The security on these tests? Absolutely zero. We took them as take-home tests. We had test-taking parties. We called up older siblings who were biology majors. The teacher knew we did these things. The teacher did not care, and it did not matter, because the tests required reasoning and application of the basic understanding of the scientific concepts. It wasn't enough, for instance, to know the parts of a single-celled organism-- you had to work out how those parts were analogous to the various parts of a city where the residents made pottery. You had to break down the implications of experimental design. And as an extra touch, after taking the test for a week outside of class, you had to take a different version of the same test (basically the same questions in a different order) in class.
Did people fail these zero-security take home tests? Oh, yes. They did.
I often think of those tests these days, because they were everything that modern standardized test manufacturers claim their tests are.
Test manufacturers and their proxies tell us repeatedly that their tests require critical thinking, rigorous mental application, answering questions with more than just rote knowledge.
They are lying.
They prove they are lying with their relentless emphasis on test security. Teachers may not look at the test, cannot so much as read questions enough to understand the essence of them. Students, teachers, and parents are not allowed to know anything specific about student responses after the fact (making the tests even less useful than they could possibly be).
And now, of course, we've learned that Pearson apparently has a super-secret cyber-security squad that just cruises the interwebs, looking for any miscreant teens who are violating the security of the test and calling the state and local authorities to have those students punished (and, perhaps, mounting denial of service attacks on any bloggers who dare to blog about it).
This shows a number of things, not the least of which is what everyone should already have known-- Pearson puts its own business interests ahead of anything and everything.
But it also tells us something about the test.
You know what kind of test needs this sort of extreme security? A crappy one.
Questions that test "critical thinking" do not test it by saying, "Okay, you can only have a couple of minutes to read and think about this because if you had time to think about it, that wouldn't be critical thinking." A good, solid critical thinking question could take weeks to answer.
Test manufacturers and their cheerleaders like to say that these tests are impervious to test prep-- but if that were true, no security would be necessary. If the tests were impervious to any kind of advance preparation aimed directly at those tests, test manufacturers would be able to throw the tests out there in plain sight, like my tenth grade biology teacher did.
A good assessment has no shortcuts and needs no security. Look at performance-based measures-- no athlete shows up at an event and discovers at that moment, "Surprise! Today you're jumping over that bar!"
Authentic assessment is no surprise at all. It is exactly what you expect because it is exactly what you prepared for, exactly what you've been doing all along-- just, this time, for a grade.
The fact that Big Stupid Test manufacturers insist that their test must be a surprise, that nobody can know anything about it, is a giant, screaming red alarm signal that these tests are crap. In what other industry can you sell a customer a product and refuse to allow them to look at it? It's like selling the emperor his new clothes and telling him they have to stay in the factory closet. Who falls for this kind of bad sales pitch? "Let me sell you this awesome new car, but you can never drive it and it will stay parked in our factory garage. We will drive you around in it, but you must be blindfolded. Trust us. It's a great car." Who falls for that??!!
The fact that they will go to such extreme and indefensible lengths to preserve the security of their product is just further proof that their product cannot survive even the simplest scrutiny.
The fact that product security trumps use of the product just raises this all to a super-Kafkaesque level. It is more important that test security be maintained than it is that teachers and parents get any detailed and useful information from it. Test fans like to compare these tests to, say, tests at a doctor's office. That's a bogus comparison, but even if it weren't, test manufacturers have created a doctor's office in which the doctor won't tell you what test you're getting, and when the test results come back STILL won't tell you what kind of test they gave you and will only tell you whether you're sick or well-- but nothing else, because the details of your test results are proprietary and must remain a secret.
Test manufacturers like Pearson are right about one thing-- we don't need the tests to know how badly they suck, because this crazy-pants emphasis on product security tells us all we need to know. These are tests that can't survive the light of day, that are so frail and fragile and ineffectual that these tests can never be tested, seen, examined, or even, apparently, discussed.
Test manufacturers are telling us, via their security measures, just how badly these tests suck. People just have to start listening.
Pearson Is Big Brother
You've already heard the story by now-- Pearson has been found monitoring students on social media in New Jersey, catching them tweeting about the PARCC test, and contacting the state Department of Education so that the DOE can contact the local school district to get the students in trouble.
You can read the story here at the blog of NJ journalist Bob Braun. Well, unless the site is down again. Since posting the story, Braun's site has gone down twice that I know of. Initially it looked like Braun had simply broken the internet, as readers flocked to the report. Late last night Braun took to Facebook to report that the site was under attack and that he had taken it down to stop the attack. As I write this (6:17 AM Saturday) the site and the story are up, though loading slowly.
The story was broken by Superintendent Elizabeth Jewett of the Watchung Hills Regional High School district in an email to her colleagues. In contacting Jewett, Braun learned that she confirmed three instances in which Pearson contacted the NJDOE to turn over miscreant students for the state to track down and punish. [Update: Jewett here authenticates the email that Braun ran.]
Meanwhile, many alert eyes turned up this: Pearson's Tracx, a program that may or may not allow the kind of monitoring we're talking about here.
Several thoughts occur. First, under exactly whose policy are these students to be punished? Does the PARCC involve them taking the same kind of high-security secrecy pledge that teachers are required to take, and would such a pledge even be binding when signed by a minor, anyway?
How does this fit with the ample case law already establishing that, for instance, students can go online and create websites or fake Facebook accounts mocking school administrators? They can mock their schools, but they have to leave important corporations alone?
I'm also wondering, again, how any test that requires this much tight security could not suck. Seriously.
How much of the massive chunk of money paid by NJ went to the line item "keep an eye on students online"?
Granted, the use of the word "spying" is a bit much-- social media are not exactly secret places where the expectation of privacy is reasonable or enforceable, and spying on someone there is a little like spying on someone in a Wal-mart. But it's still creepy, and it's still one more clear indicator that Pearson's number one concern is Pearson's business interests, not students or schools or anything else. And while this is not exactly spying, the fact that Pearson never said a public word about their special test police cyber-squad, not even to spin it in some useful way, shows just how far above student, school, and state government they envision themselves to be.
Pearson really is Big Brother-- and not just to students, but to their parents, their schools, and their state government. It's time to put some serious pressure on politicians. If they're even able to stand up to Pearson at this point, now is the time for them to show us.
Friday, March 13, 2015
PA: All About the Tests (And Poverty)
In Pennsylvania, we rate schools with the School Performance Profile (SPP). Now a new research report reveals that the SPP is pretty much just a means of converting test scores into a school rating. This has huge implications for all teachers in PA because our teacher evaluations include the SPP for the school at which we teach.
Research for Action, a Philly-based education research group, just released its new brief, "Pennsylvania's School Performance Profile: Not the Sum of Its Parts." The short version of its findings is pretty stark and not very encouraging--
90% of the SPP is directly based on test results.
90%.
SPP is our answer to the USED waiver requirement for a test-based school-level student achievement report. It replaces the old Adequate Yearly Progress of NCLB days by supposedly considering student growth instead of simple raw scores. It rates schools on a scale of 0-100, with 70 or above considered "passing." In addition to being used to rate schools and teachers, SPPs get trotted out any time someone wants to make a political argument about failing schools.
RFA was particularly interested in looking at the degree to which SPP actually reflects poverty level, and their introduction includes this sentence:
Studies both in the United States and internationally have established a consistent, negative link between poverty and student outcomes on standardized tests, and found that this relationship has become stronger in recent years.
Emphasis mine. But let's move on.
SPP is put together from a variety of calculations performed on test scores. Five of the six component measures-- which account for 90% of the score-- "rely entirely on test scores."
Our analysis finds that this reliance on test scores, despite the partial use of growth measures, results in a school rating system that favors more advantaged schools.
Emphasis theirs.
The brief opens with a consideration of the correlation of SPP to poverty. I suggest you go look at the graph for yourself, but I will tell you that you don't need any statistics background at all to see the clear correlation between poverty and a lower SPP. And as we break down the elements of the SPP, it's easy to see why the correlation is there.
Indicators of Academic Achievement (40%)
Forty percent of the school's SPP comes from a proficiency rating (aka just plain straight-up test results) that comes from tested subjects, third grade reading, and the SAT/ACT College Ready Benchmark. Whether we're talking third grade reading or high school Keystone exams, "performance declines as poverty increases."*
Out of 2,200 schools sampled, 187 had proficiency ratings higher than 90, and only seven of those had more than 50% economically disadvantaged enrollment. Five of those were Philly magnet schools.
Indicators of Academic Growth aka PVAAS (40%)
PVAAS is our version of a VAM rating, in which we compare actual student performance to the performance of imaginary students in an alternate neutral universe run through a magical formula that corrects for everything in the world except teacher influence. It is junk science.
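For readers who want to see the general shape of the thing, here is a minimal toy sketch of how a value-added estimate is usually built: predict each student's score from a prior score, then average the leftover "residual" for each teacher. This is not the actual PVAAS formula-- the teacher names and numbers below are invented purely for illustration.

```python
# Toy sketch of the general value-added idea (NOT the actual PVAAS formula).
# Predict this year's score from last year's with a simple least-squares line,
# then credit or blame each teacher with the average leftover (residual).
from collections import defaultdict

# Invented data: (teacher, prior_score, current_score)
students = [
    ("Smith", 70, 78), ("Smith", 85, 88), ("Smith", 60, 65),
    ("Jones", 72, 71), ("Jones", 90, 86), ("Jones", 65, 70),
]

n = len(students)
mean_x = sum(prior for _, prior, _ in students) / n
mean_y = sum(curr for _, _, curr in students) / n
slope = sum((p - mean_x) * (c - mean_y) for _, p, c in students) / \
        sum((p - mean_x) ** 2 for _, p, _ in students)
intercept = mean_y - slope * mean_x

# Everything the prediction misses -- poverty, attendance, the flu in March --
# lands in the residual that gets labeled the "teacher effect."
residuals = defaultdict(list)
for teacher, prior, current in students:
    residuals[teacher].append(current - (intercept + slope * prior))

for teacher, r in sorted(residuals.items()):
    print(teacher, round(sum(r) / len(r), 2))
```

The comment in the middle is the whole critique in one line: whatever the prior-score prediction fails to account for gets attributed to the teacher.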
RFA found that while the correlation with poverty was still there, when it came to PSSAs (our elementary test) it was not quite as strong as the proficiency correlation. For the Keystones, writing and science tests, however, the correlation with poverty is, well, robust. Strong. Undeniable. Among other things, this means that you can blunt the impact of Keystone test results by getting some PSSA test-takers under the same roof. Time to start that 5-9 middle school!!
Closing the Achievement Gap (10%)
This particular measure has a built-in penalty for low-achieving schools (aka high poverty schools-- see above). Basically, you've got six years to close half the proficiency gap between where you are and 100%. If you have 50% proficiency, you've got six years to hit 75%. If you have 60%, you have six years to hit 80%. The lower you are, the more students you must drag over the test score finish line.
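Here is the arithmetic of that gap-closing target as a quick sketch; the "close half the gap in six years" rule is taken straight from the description above, and the sample proficiency rates are just for illustration.

```python
def six_year_target(current_proficiency):
    # Per the rule above: close half the gap between where you are and 100%.
    return current_proficiency + (100 - current_proficiency) / 2

for current in (50, 60, 80, 90):
    target = six_year_target(current)
    print(f"{current}% proficient now -> {target:.0f}% in six years "
          f"(a gain of {target - current:.0f} points)")
```

The lower the starting point, the bigger the required gain-- which is exactly how this measure penalizes high-poverty schools.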
That last 10%, incidentally, is items like graduation rate and attendance rate. Pennsylvania also gives you points for the number of students you can convince to buy the products and services of the College Board, including AP stuff and PSAT. So kudos to the College Board people on superior product placement. Remember kids-- give your money to the College Board. It's the law!
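Put the whole composite together and the 90% figure is easy to see. Here is a rough sketch using the weights described above; the component labels are simplified and the sample scores are invented.

```python
# SPP composite using the weights described above (labels simplified, scores invented).
components = {
    # name: (weight, sample score 0-100, based on test scores?)
    "academic achievement":        (0.40, 55, True),
    "academic growth (PVAAS)":     (0.40, 70, True),
    "closing the achievement gap": (0.10, 40, True),
    "other (graduation, attendance, AP/PSAT, etc.)": (0.10, 92, False),
}

spp = sum(weight * score for weight, score, _ in components.values())
test_weight = sum(weight for weight, _, test_based in components.values() if test_based)

print(f"Sample SPP: {spp:.1f}")
print(f"Share of the rating that traces back to test scores: {test_weight:.0%}")
```

Swap in any plausible numbers and the structure stays the same: nine-tenths of the rating is test scores, dressed up in different calculations.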
Bottom line-- we have schools in PA being judged directly on test performance, and we have data once again clearly showing that the state could save a ton of money by simply issuing school ratings based on the income level of students.
For those who want to complain, "How dare you say those poor kids can't achieve," I'll add this. We aren't measuring whether poor kids can achieve, learn, accomplish great things, or grow up to be exemplary adults-- there is no disputing that they can do all those things. But we aren't measuring that. We are measuring how well they do on a crappy standardized test, and the fact that poverty correlates with results on that crappy test should be a screaming red siren that the crappy test is not measuring what people claim it measures.
*Correction: I had originally included a mistyping here that reversed the meaning of the study.
Thursday, March 12, 2015
Raj Chetty for Dummies
The name Raj Chetty has been coming up a great deal lately, like a bad burrito that resists easy digestion. A great deal has been written about Chetty and his scholarly work, much of it by other scholars in various states of apoplexy. My goal today is not to contribute to that scholarly literature, but to try to translate the mass of writing by various erudite economists, scholars and statisticians into something shorter and simpler that ordinary civilians can understand.
In other words, I'm going to try to come up with a plain answer for the question, "Who is Raj Chetty, what does he say, how much of it is baloney, and why does anybody care?"
Who is Raj Chetty?
Chetty immigrated to the US from New Delhi at age nine. By age 23 he was an associate professor of economics at UC Berkeley, receiving tenure at age 27. At 30, he returned to his alma mater and became the Bloomberg Professor of Economics at Harvard.
He has since become a bit of a celebrity economist, consulted and quoted by the President and members of Congress. He has won the John Bates Clark Medal; Fortune put him on their list of influential people under forty in business.
Chetty started attracting attention late in 2010 with the advance announcement of research that would give a serious shot in the arm to the Value-Added Measurement movement in teacher evaluation. His work has also made special appearances in the State of the Union address and the Vergara trial.
What does Chetty say?
The sexy headline version of Chetty is that a child who has a great kindergarten teacher will make more money as an adult.
The unsexy version isn't much more complicated than that. What Chetty et al (he has a pair of co-authors on the study) say is that a high-VAM teacher can raise test scores in younger students (say, K-4), and while that effect will disappear around 8th grade, eventually those VAM-exposed children will start making bigger bucks as adults.
Implications? Well, as one of Chetty's co-authors told the New York Times--
“The message is to fire people sooner rather than later,” Professor Friedman said.
Chetty's work has been used to buttress the folks who believe in firing our way to excellence-- just keep collecting VAM scores and ditching the bottom 5% of your staff. Chetty also plays well in court cases like Vergara, where it can be used to create the appearance of concrete damage to students (if Chris has Mrs. McUnvammy for first grade, Chris will be condemned to poverty in adulthood, ergo the state has an obligation to fire Mrs. McUnvammy toot suite). You can read one of the full versions of the paper here.
Who disagrees with Chetty?
Not everybody. In particular, economist Eric Hanushek has tried to join this little cottage industry, and lots of reformy policy makers love to quote his study.
But the list of Chetty naysayers is certainly not short. Chetty appears to evoke a rather personal reaction from some folks, who characterize him as everything from a self-important twit to a clueless scientist who doesn't understand that he's building bombs that blow up real humans. I've never met the man, and nothing in his writing suggests a particular personality to me. So let's just focus on his work.
Moshe Adler at Columbia University wrote a research response to Chetty's paper for NEPC. This provoked a response from Chetty et al, which provoked yet another response from Adler. You can read the whole conversation here, but I'll warn you right now that you're not going to just scan it over lunch.
Meanwhile, you'll recall that the American Statistical Association came out pretty strongly opposed to VAM, which also put them in the position of being critical-- directly and indirectly-- of Chetty. Chetty et al took it upon themselves to deliver the ASA a lesson in statistical analyses ("I will keep my mouth shut because these people are authorities in areas outside my expertise," is apparently really hard for economists to say) which led to a conversation recounted here.
What do the scholarly and expert critics say?
To begin with, the study has a somewhat checkered publication history, debuting as news blurbs in 2010 and making its way up to publication in a non-peer-reviewed journal, then to republishing as two articles, then in a peer-reviewed journal. That history, along with many criticisms of the study, can be found here at VAMboozled, the blog of Audrey Amrein-Beardsley (the blog is a wealth of resources about all things VAM).
Many criticize Chetty's methodology. Adler's critique suggests that Chetty may have fudged some numbers, excluded some data, and ignored previous research that didn't fit his framework. Amrein-Beardsley (and others) accuse Chetty of ignoring the context of the data. Many critics suggest that Chetty is trying to make a mountain out of a molehill.
You can chase scholarly links all day long, though the NEPC link to Moshe's work and simply typing "Chetty" into the VAMboozled search box will provide more than enough reading for an afternoon. Or two.
So how much of Chetty's work is bunk?
I'm going to go with "most of it."
Chetty's idea was to link VAM measures to later success-- to be able to say, "Look! High-VAM teachers grow successful students." There are several problems with this.
First, studies of VAM-based teacher effectiveness always seem to descend into the same tautology. Use test scores to measure VAM. Use VAM to identify the best teachers. Check to see if VAM-certified teachers raise test scores. Strip out the fancy language and funky math and you're left with a fairly simple tautology-- "Teachers who get students to have high test scores tend to get students to have high test scores." This is no more insightful or useful than research to show that bald men tend to be bald.
Second, Chetty doesn't seem to distinguish between correlation and causation. His results seem to scream for that consideration-- six-year-olds who do better on tests don't grow into twelve-year-olds who do better on tests, but they do grow into twenty-eight-year-olds who make more money. I'm no economist, but to me, the yawning gulf between the alleged cause and the supposed effect leaves enough room for a truckload of other possible causes. This holds together just about as well as "because I buried a toad under a full moon a year ago, I met my true love today."
And as it turns out, an explanation is readily available. We know who does better on standardized tests-- the children of high income families. We know who's more likely to get better-paying jobs as adults-- the children of high income families. It seems highly probable that the conclusion to be drawn from Chetty's research is, "Children of higher-income families do better on tests and get higher-paying jobs."
Chetty himself tried to plug that last hole, with research about economic mobility that concluded that it's not any worse than it was a decade ago-- but it's still pretty lousy. Chetty et al also insist that the students were distributed across the classrooms in completely random fashion. This strikes many as an assumption without foundation.
Put another way-- a mediocre teacher with a classroom full of rich kids who test well would earn a high VAM and those well-heeled students would still go on to have well-paying jobs, and nothing in Chetty's model would ever reveal that Mr. McMediocre was less than awesome.
There are other detail-inhabiting devils. The "big difference" in future earnings seems to vary according to which draft of the report we're looking at, and Chetty only claims them as far as the students turning twenty-eight-- the "lifetime earnings" claims are based on the assumption that the subjects will just keep getting the same raises for the rest of their lives that they got up until age twenty-eight. That is a heck of a bold assumption.
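To see why that last assumption matters, here is a toy sketch of the kind of extrapolation being criticized: take a salary and a raise rate observed by age twenty-eight, then assume the raises simply continue to retirement. Every number below is invented for illustration.

```python
# Toy sketch: "lifetime earnings" extrapolated from raises observed by age 28.
# All figures are invented; the point is how sensitive the total is to the assumed raise rate.
def lifetime_earnings(salary_at_28, annual_raise, retirement_age=65):
    total, salary = 0.0, float(salary_at_28)
    for _age in range(28, retirement_age + 1):
        total += salary
        salary *= (1 + annual_raise)
    return total

base = 40_000
print(f"Assume 5% raises forever: ${lifetime_earnings(base, 0.05):,.0f}")
print(f"Assume 2% raises forever: ${lifetime_earnings(base, 0.02):,.0f}")
```

In this toy example, a few percentage points of assumed raise swing the projected lifetime total by roughly two million dollars, which is why projecting early-career raises across an entire working life is such a bold assumption.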
Chetty's work rests on the unproven assumption that VAM is not junk. VAM, in turn, rests on the assumption that 1) the Big Standardized Tests provide meaningful data and 2) a magical formula can filter out all other factors related to student results on the BS Tests. Chetty's work also assumes that adult success is measured in monetary terms. And Chetty's work ignores the difference between correlation and causation, and instead makes a huge leap of faith to link cause and effect. I would bet you dollars to donuts that we could perform research that would "prove" that eating a good breakfast when you're six, or having a nice pair of shoes when you're ten, can also be linked to higher-paying jobs in adulthood. As it is, we have "proof" that Nicolas Cage causes death by drowning, and that margarine causes divorce in Maine.
Chetty's work is not going to go away because it's sexy, it's simple, and it supports a whole host of policy ideas that people are already trying to push. But it is proof positive that just because somebody teaches at Harvard and wins awards, that doesn't mean they can't produce "research" that is absolute baloney.
Wednesday, March 11, 2015
Teaching to the Test Is not Teaching
"Teaching to the test" is an oft-repeated phrase these days. We discuss it a great deal in education because A) we're doing it more than ever and B) everyone knows we're not supposed to.

Testing should follow instruction, both sequentially and conceptually. Anything else is backwards pedagogy, an educational cart before the instructional horse. We begin instructional design by setting our goals for the unit, and then we develop our design by asking, "How can I best teach that, and how will I best determine whether or not the students learned it?" This is not the same as setting out to teach students how to pass the test.
Many interpret "teach to the test" as simply drill and kill, with an emphasis on the actual questions that will be on the test, a more complicated version of flat out cheating. But in the era of test-driven accountability and our lousy high-stakes standardized tests, it's not that simple. There are other, more subtle but equally time-wastey means of teaching to a test.
Teaching to the test means instructing students in artificial, inauthentic tasks that they will find nowhere in the world but on a standardized test. The PARCC practice test involves a group of several possible ideas that one might find in the reading selection. The testee must click and drag the correct ideas into a box, and then select, click and drag the correct details from another list into boxes next to the boxes from the first part of the question. A teacher who is depending on those student test scores would be crazy not to do a few units on "How to answer these weird computer questions that you'll have on the PARCC."
Teaching to the test means teaching students how to navigate gotcha questions. Students now need to learn about distractors and the sorts of deliberate wrong answers that tests will throw at them in an attempt to trick them into choosing incorrectly. Never mind close reading the selection-- students need to close read the question and answers in order to discern what traps the test writers have laid.
Teaching to the test means teaching students how to write test-style essay answers. This does not involve doing what is generally considered Good Writing anywhere but in Testland. Test essay writing means recycle the prompt, use big words, and never, ever, get distracted by what you actually think. Test-style writing means figuring out what the test writers want you to say. Then say it.
Teaching to the test means preparing students for one narrow task, like teaching a chocolate lab to fetch. It is not so much teaching as training. It is not the work we signed up for as teachers, but it has become the work we are judged by.
Some make the argument that if we simply teach our students to be awesome, they will be able to transfer that awesomeness directly to the Big Standardized Test. This is like arguing that if we want to teach someone how to get from California to Ohio, a good test for that would be to demand that he show up at the Meister Road entrance to Willow Park in Lorain without using the internet, riding on horseback while playing "Dixie" on an electronic kazoo.
Teaching to the test is not good pedagogy; good pedagogy is teaching the student and finding a way to let the student show you what she knows.
Originally posted at View from the Cheap Seats
Super Slaps School Board Into Submission
Last night was the night for the Ken-Ton School Board and their president Bob Dana to take a stand against the test-and-bully policies of New York State. Faced with an extremely reluctant superintendent, the board blinked.
On Monday, I reported that the Kenmore-Town of Tonawanda School District, located a bit north of Buffalo, NY, was going to consider two resolutions-- one demanding that NY's teacher evaluation system be de-coupled from testing and the other demanding that Governor Cuomo stop holding everyone's money hostage. The "or else" was that the district would stop giving the test and counting it in teacher evaluations. Superintendent Dawn Mirand released a statement expressing her opposition to the move. The statement was pretty clear, but just in case there were any doubts, she reportedly made herself even clearer at last night's board meeting.
Joseph Popiolkowski had the story for this morning's Buffalo News:
"If the district’s state aid, which is currently 32 percent of its budget, or about $50 million, was withheld by the state as punishment, that would result in a 71 percent tax increase, she said. The average home assessed at $100,000 would see a $1,500 tax increase, “or massive layoffs would have to take place,” she [Mirand] said.
On top of that, board members could be removed from office and teachers who refused to administer the test might lose their certification. Furthermore, fire might rain from the sky, dogs and cats living together, mass hysteria.
Mirand just wants everyone to be aware of the risks.
You can see from coverage by TV station WKBW that the meeting pulled in a double-full house of community people, and that's a double-full house of people who were vocally in favor of standing up to Governor Cuomo. One parent in the newscast compares the action to taking a stand for civil rights.
Ken-Ton is one of the districts in NY that took a financial hit under the Gap Elimination Adjustment, which has oddly enough created budget gaps in many districts-- in Ken-Ton the cost has been about $40 million.
Mirand has only been in place since May of 2014. While she is clearly not one of those heroic warrior superintendents standing up to reformy nonsense, she is an actual educator, who started out as a teacher and has worked her way up in the region. Bob Dana was president when the board hired her, and he expressed enthusiasm for her at the time. She's having one fun first year.
Other board members range from firmly in Dana's corner to slightly apprehensive, and since the resolutions have only been out there for a few days, several would like a chance to finish thinking things through. The board has also invoked that old stand-by of nervous politicians everywhere-- a waiting period to get more community input.
The resolutions are tabled until the April meeting of the board. In the meantime, you can bet that there will be some spirited conversing in the Ken-Ton school district.