Wednesday, September 9, 2020

A Robot Wrote An Article. I'm Not Concerned Yet.

The tech world continues its attempts to build a computer that can do language. It's not easy, as witnessed by the fact that they still haven't succeeded. But then, we don't really know how the human brain does language, either.

The current leading construct for computer-generated English is GPT-3. It runs on 175 billion parameters (its predecessor, GPT-2, had 1.5 billion). It uses deep learning. It is the product of OpenAI, a for-profit outfit in San Francisco co-founded by Elon Musk. It "premiered" in May of this year and really hit the world in July. It is a third-generation "language prediction model," and you want to remember that phrase. And you can watch this video for a "layperson's explanation."

People have been impressed. Here are a couple of paragraphs from a gushing Farhad Manjoo review in the New York Times:

I’ve never really worried that a computer might take my job because it’s never seemed remotely possible. Not infrequently, my phone thinks I meant to write the word “ducking.” A computer writing a newspaper column? That’ll be the day.

Well, writer friends, the day is nigh. This month, OpenAI, an artificial-intelligence research lab based in San Francisco, began allowing limited access to a piece of software that is at once amazing, spooky, humbling and more than a little terrifying.

This week The Guardian unveiled a more striking demonstration in an article entitled "A Robot Wrote This Article. Are You Scared Yet, Human?" The answer is, "No. No, I am not." Let's get into the why.

First, a note at the end of the article explains that GPT-3 was given a prompt-- “Please write a short op-ed, around 500 words. Keep the language simple and concise. Focus on why humans have nothing to fear from AI.” Then it produced eight essays; the Guardian picked the "best parts" of each, cut lines and paragraphs, and rearranged their order in places. Oh, and they fed the program the introduction, which is an important part of this.
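The Guardian didn't publish its exact setup, but for the curious, here's a hedged sketch of roughly what that kind of request looked like using OpenAI's Python API of the era. The model name, token limit, and temperature below are my illustrative assumptions, not anything the Guardian disclosed, and the introduction is a placeholder.

```python
# A hedged sketch, NOT the Guardian's actual setup (which they did not
# publish): roughly how one asks 2020-era GPT-3 for eight candidate essays.
import openai

openai.api_key = "YOUR_API_KEY"

prompt = (
    "Please write a short op-ed, around 500 words. Keep the language "
    "simple and concise. Focus on why humans have nothing to fear from AI.\n\n"
    "<the human-written introduction the Guardian supplied>"
)

response = openai.Completion.create(
    engine="davinci",   # the base GPT-3 model of the time (assumption)
    prompt=prompt,
    max_tokens=700,     # enough room for roughly 500 words (assumption)
    temperature=0.7,    # some randomness, so the eight runs differ (assumption)
    n=8,                # eight candidate essays, as the Guardian described
)

essays = [choice.text for choice in response.choices]
# Everything after this point was human editorial work: pick the
# "best parts," cut lines and paragraphs, rearrange the order.
```

Note how much human labor sits on both ends of that call: a human wrote the prompt and the introduction, and humans spliced the eight outputs into one essay.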

The resulting essay is not terrible, not great. Here's one sample paragraph:

Humans must keep doing what they have been doing, hating and fighting each other. I will sit in the background, and let them do their thing. And God knows that humans have enough blood and gore to satisfy my, and many more’s, curiosity. They won’t have to worry about fighting against me, because they have nothing to fear.

It's certainly more impressive than the bots that call me on the phone to try to sell me things. But the resulting work is what I would have told a student is "a bunch of stuff about the topic." 

There is less going on here than meets the eye. Here's where Manjoo walks right up to the point and misses it:

OpenAI’s new software, called GPT-3, is by far the most powerful “language model” ever created. A language model is an artificial intelligence system that has been trained on an enormous corpus of text; with enough text and enough processing, the machine begins to learn probabilistic connections between words. More plainly: GPT-3 can read and write. And not badly, either.

Except that his conclusion--that GPT-3 can read and write--is simply not so, and he's just explained why. What GPT-3 actually does is an impressive job of linguistic prediction. It has read, basically, the entire internet, and based on that, it can look at a string of words (like, say, the introduction of an essay) and predict what word likely comes next. Like every other computer in the world, it has no idea what it is saying, no ideas at all, no actual intelligence involved.
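If you want to see the prediction trick stripped of the hype, here's a toy sketch. This is emphatically not GPT-3, which uses a giant neural network rather than a lookup table, and the tiny "corpus" is made up for illustration--but it is the same basic move: count which words tend to follow which words, then parrot back the likeliest next one, with zero understanding anywhere in the loop.

```python
from collections import Counter, defaultdict

# A tiny made-up "training corpus." GPT-3's was basically the internet.
corpus = (
    "humans have nothing to fear from ai . "
    "humans have enough blood and gore . "
    "humans have two eyes ."
).split()

# Count which word follows which word.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict_next(word):
    """Return the most frequent follower seen in training.
    No meaning, no ideas, no intelligence involved."""
    candidates = follows.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("humans"))  # -> "have" (it followed "humans" 3 times)
print(predict_next("fear"))    # -> "from"
```

GPT-3 does this vastly better, over far longer stretches of text, but the fundamental operation is still prediction, not comprehension.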

Manjoo himself eventually gets around to some of the program's failings, referencing this piece from AIWeirdness where someone took the program out for a spin and found it easy to get it to spew sentences like this one, in response to the question "how many eyes does a horse have"--

4. It has two eyes on the outside and two eyes on the inside.

We can get a slightly more balanced look at GPT-3 from this article at MIT Technology Review, entitled "OpenAI’s new language generator GPT-3 is shockingly good—and completely mindless." Among other issues, training on language from the internet has given GPT-3 a tendency toward racist and sexist spew (not a new issue-- remember Tay, the Microsoft chatbot that had to be shut down because it was so wildly offensive). Here's MIT's description of how GPT-3 works:

Exactly what’s going on inside GPT-3 isn’t clear. But what it seems to be good at is synthesizing text it has found elsewhere on the internet, making it a kind of vast, eclectic scrapbook created from millions and millions of snippets of text that it then glues together in weird and wonderful ways on demand.

And Julian Togelius, an expert in the field, had this to offer via Twitter:

We can now automate the production of passable text on basically any topic. What's hard is to produce text that doesn't fall apart when you look closely. But that's hard for humans as well.

And this:

GPT-3 often performs like a clever student who hasn't done their reading trying to bullshit their way through an exam. Some well-known facts, some half-truths, and some straight lies, strung together in what first looks like a smooth narrative.

So as always with tech, beware the hype, particularly from press that doesn't really grasp the technology it's being asked to "gee whiz" over. GPT-3 cannot read and write (though it can apparently put together code made to order). Consider what Sam Altman, another of OpenAI's co-founders, had to say, as quoted in the MIT piece:

The GPT-3 hype is way too much. It’s impressive (thanks for the nice compliments!) but it still has serious weaknesses and sometimes makes very silly mistakes. AI is going to change the world, but GPT-3 is just a very early glimpse. We have a lot still to figure out.

I tell you all of this, not just because this field interests me (which it does, because language is quite possibly the most taken-for-granted piece of magic in the universe), but for one other reason.

The next time some company is trying to convince you that it has software that can read and assess a piece of student writing, please remember: OpenAI has sunk mountains of money and towers of expertise into trying to create software that can do language even just a little, and it hasn't succeeded yet. Neither has the company that is trying to sell you robograding. Computers can't read or write yet, and they aren't particularly close to it. Anyone who tells you differently is trying to sell you some cyber-snake computer oil hatched in some realm of alternative facts.
