This summer the University of Delaware was happy to unveil yet more research on yet another attempt to argue that computer software has a place in writing instruction.
Being up front
As a high school English teacher, I've thought about this a great deal, and written about it on several occasions (here, here and here, for example). And mostly I think actual useful essay-grading computers are about as probable as unicorns dancing with chartreuse polar bears in fields of asparagus. We could safely label me "Predisposed to be Skeptical."
And yet I'm determined to have an open mind. But I've been down this road before, and I recognize the Big Red Flags when I see them.
Red Flag #1: Who's Paying for This?
Assistant professor Joshua Wilson, from UD's School of Education, set out to see if the software PEGWriting could be used not just to score student writing, but to inform and assist instruction throughout the year. Why would he want to look into that?
The software Wilson used is called PEGWriting (which stands for Project
Essay Grade Writing), based on work by the late education researcher
Ellis B. Page and sold by Measurement Incorporated, which supports
Wilson's research with indirect funding to the University.
So, the software maker paid for and perhaps commissioned this research. Just to be clear, the fact that there's no direct quid pro quo makes it worse-- if I'm counting on your funding to pay for the project I'm doing, the funding and the project can go away together and life can go on for the rest of my department. But if I'm doing research on your product over here and you're paying for, say, all our office furniture over there, the stakes are higher.
At any rate, this is clear built-in bias. Anything else?
Red Flag #2: You're Scoring What??!!
The software uses algorithms to measure more than 500 text-level
variables to yield scores and feedback regarding the following
characteristics of writing quality: idea development, organization,
style, word choice, sentence structure, and writing conventions such as
spelling and grammar.
First, I know you think it's impressive that it's measuring 500 variables ("text-level"-- as opposed to some other level?? Paper-level?), but it's not. It's like telling me that you have a vocabulary of 500 words. Not so impressive, given the nature of language.
But beyond that-- PEGWriting wants to market itself as being tuned into six trait writing. I have no beef with the six traits-- I've used them myself for decades. And that's how I know that no software can possibly do what this software claims it can do.
Idea development? Really? I will bet you dollars to donuts that if I take my thesis statement ("Abe Lincoln was a great peacemaker") and develop it with absolute baloney support ("Lincoln helped bring peace by convincing Hitler to give up his siege of the Alamo"), the software will think that's swell. The software cannot read or understand ideas. It cannot assess this trait. Nor can it assess organization beyond looking for recycled prompts and transition words (Next, Furthermore, On the other hand). Nor can it have the slightest idea whether my word choices were best suited to the ideas in my essay. Any evaluation of sentence structure or style will be restricted to simply counting up types of sentences that it can (mostly) identify based on structure words and punctuation.
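Just to make the point concrete, here is roughly what "assessing organization and style" amounts to when a machine does it-- a counting exercise. This is my own toy sketch in Python, not PEGWriting's actual code or variables, but it captures the general flavor of surface-feature scoring:

```python
import re

# A toy "organization and style" scorer -- purely my own illustration of
# surface counting, not anything a real vendor ships.
TRANSITIONS = {"next", "furthermore", "however", "therefore", "on the other hand"}

def toy_organization_score(text):
    lower = text.lower()
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # "Organization" = how many transition words appear.
    transition_hits = sum(lower.count(t) for t in TRANSITIONS)
    # "Style" = how much the sentence lengths vary, plus some commas.
    lengths = [len(s.split()) for s in sentences]
    variety = len(set(lengths))
    commas = text.count(",")
    # Note what is NOT here: any check that the ideas are true, relevant, or coherent.
    return 2 * transition_hits + variety + 0.5 * commas

nonsense = ("Lincoln helped bring peace by convincing Hitler to give up his "
            "siege of the Alamo. Furthermore, the treaty was signed by a "
            "chartreuse polar bear. On the other hand, peace prevailed.")
print(toy_organization_score(nonsense))
```

Feed it the Hitler-at-the-Alamo essay and it cheerfully hands back a respectable number, because nothing in there knows or cares what the words mean.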
But robo-writing software always hits the same barrier-- the basic unit of writing is ideas, and if the software could understand ideas, the software developers would have created artificial intelligence and they'd have far more interesting things to spend time on than student writing.
No consideration of this topic can be complete without invoking my hero Les Perelman, who has made a career out of making essay-grading software look stupid. He has demonstrated over and over and over again that software does not know the difference between good writing and gibberish.
PEGWriting does enlist the teacher's help in scoring for textual evidence and content accuracy, so that's better than simply claiming the computer can do it.
Hey! A Non-red Flag
The idea is to give teachers useful diagnostic information on each
writer and give them more time to address problems and assist students
with things no machine can comprehend.
This is a True Thing. I have my students do some fairly low-brain diagnostics on their own writing-- how many forms of "be," how many sentences, what sentence lengths, etc. Software could totally do this work, and the information, when run through a human brain, can be useful, particularly in helping writers identify their tendencies and weak areas. That is exactly the kind of thing software could do.
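For what it's worth, that kind of diagnostic counting really is trivial to automate. A minimal sketch of the tally I mean (my own, in Python-- not a feature I'm attributing to PEGWriting):

```python
import re

# Low-brain writing diagnostics: the counting I have students do by hand,
# and the kind of thing software genuinely can do. My own sketch, not any
# vendor's product.
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}

def low_brain_diagnostics(text):
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    be_count = sum(1 for w in words if w in BE_FORMS)
    lengths = [len(re.findall(r"[a-z']+", s.lower())) for s in sentences]
    return {
        "forms_of_be": be_count,
        "sentence_count": len(sentences),
        "sentence_lengths": lengths,
        "average_length": round(sum(lengths) / len(lengths), 1) if lengths else 0,
    }

draft = ("Lincoln was a great peacemaker. He was patient and he was kind. "
         "His speeches are still being read today.")
print(low_brain_diagnostics(draft))
```

The numbers themselves are dumb; it's the student and teacher looking at them who make them useful.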
Red Flag #3: Know Your Research
Researchers have established that computer models are highly predictive
of how humans would have scored a given piece of writing, Wilson said,
and efforts to increase that accuracy continue.
Well, no. You can look at my short take on Perelman's work or the whole piece, but the bottom line is that the research that Wilson is most likely referring to is thoroughly unconvincing and shot full of huge holes.
Red Flag #4: That's Not a Good Thing
Wilson's research involved handing out free copies of PEGWriting to third, fourth and fifth grade classes.
Teachers said students liked the "game" aspects of the automated writing
environment and that seemed to increase their motivation to write quite
a bit. Because they got immediate scores on their writing, many worked
to raise their scores by correcting errors and revising their work over
and over.
Um, no. That's not entirely a good thing. I'll give you the positive side effect of making writing seem more like fun than like a chore, but otherwise, the idea of having students learn that writing is like a game where you mess with words to score points-- well, that might prepare them for careers as internet trolls, but as with most bad writing instruction, it takes directly away from the actual point and purpose of writing, which is to say what you have to or want to say in the clearest way possible. Anything that reduces writing to a mechanical activity completely divorced from the actual meaningful expressions of live humans is a Bad Thing. What could be worse than the approach described above? Oh, I know--
That same quick score produced discouragement for other students,
though, teachers said, when they received low scores and could not
figure out how to raise them no matter how hard they worked.
Emphasis mine. Because that "hard work" will be composed entirely of trying to mechanically manipulate pieces parts of written stuff. It will be no more about learning to write well than Super Mario Brothers is about learning how to talk to girls.
Is It All Bad?
The software does seem to offer some useful features, including an interactivity with both teacher and peer reviewers that could be handy. And I confess that I find some appeal in the idea of an online platform that holds onto all the pieces parts of writing instruction.
Meanwhile, Wilson is looking for "efficiencies," and his work does seem to suggest some evolution in how software companies market these products, as well as a clearer teacher role in collaboration with the software. The old approach was to present software that would do everything for you; this research seems more focused on figuring out ways in which the software can help with instruction by saving time on things that software can actually do.
The bad news for the software manufacturers is that the answer to "what parts of writing assessment can software actually do" is "not many." I do think it's possible to create useful software, but unfortunately, given how many teachers and administrators are looking for a quick and easy shortcut to writing instruction, vendors will keep trying to cash in on that market with crappy products that try to do too much that computers cannot do, resulting in more of these crappy pieces of crappity crap.
But But But
You may say that I'm quibbling about a level of writerly sophistication that only comes into play in older students, and that as long as we're just talking about elementary students, this sort of mechanical trained-monkey approach is fine.
I vehemently disagree.
The most important thing that young students learn about writing is what it is, what it's for, and how to engage with it. When we teach young students that writing is a series of mechanical tasks performed to make some teacher or software happy, we do huge long-term damage, and we turn potential writers into people who don't even know what writing is. From day one, we should be teaching them that writing is a cool way to communicate what you think and feel to other human beings, and that it starts inside your own brain and heart, not in some set of instructions.
Writing instruction done well is powerful, because young humans who are aching to be heard can discover a way to put their voice into the world, to be heard and responded to by other human beings. There is no similar excitement to be found in gaming a computer.
Here's the deal. When it comes to assessing writing via the standardized assessments, the human scorers are given so little time--maybe 20 seconds per student sample--that they are reduced to this machine level. And the machines can beat them at that. Plus, machines are impervious to the supervisor saying we're giving out too many passing scores--knock the essays down several pegs. Therefore, machines beat people if and only if we are dealing with standardized assessments. Sigh, which seems to be all we care about anymore, thus, the robo-scorer is the go-to choice. Can't wait for the robo-teacher, but perhaps by then we will have robo-students, so the machines can assess what the machines taught to machines. I think I have the premise for a new dystopian novel.
Hilarious.
Exactly. The point of writing, like all language and language arts, is "to communicate what you think and feel to other human beings." You're successful when other human beings read it and understand what you think and feel. A machine can't give you that. Writing class is to help you refine your writing so that people understand better what you mean. And it's called "language arts" because it can be an art. Machines don't think and feel, and they don't understand art.