"Okay, I think I see where you went wrong..."
"Hmm. Can you explain to me why you took this step here...?"
"That's an interesting interpretation, but I think you might have overlooked this..."
There are so many ways in which generative language algorithms (marketed as AI) can't do the work of a teacher, some larger than others.
Some are pretty basic. The notion that AI can create lesson plans only makes sense if you think a good way to do lesson plans would be to have an assistant google the topic and then create a sort-of-summary of what they found.
But other obstacles are fairly huge.
Certainly there's a version of teaching that looks like this:
Student: Here's an answer.
Teacher: That's wrong. Try again.
Student: How about this?
Teacher: Still wrong. Try again.
For different sorts of content, there's a version like this.
Teacher: Do A, then B, then C, and you will get X.
Student: Um, I got Q somehow.
Teacher: Do A, then B, then C, and you will get X.
Student: I'm not so sure about the B part. Also, I got V this time.
Teacher: Do A, then B, then C, and you will get X.
The technical term for this kind of teaching is "poor" or even "bad." Also, "teaching via Khan Academy." This also applies to new AI-powered versions like Khanmigo, which tries to help by essentially directing you to a video that specifically shows you B. Or you can throw in "special interests" and the AI will "incorporate" references to your favorite hobby.
Part of the work is to try to get inside the student's head. It is not enough to assess whether the student has produced an answer that is right or wrong or sort-of-right, and it's certainly not enough to repeat some version of "Don't be wrong. Be right" over and over again. The job is to figure out where they may have stumbled, to see where they are in the vast territory of content and skills that we are helping them navigate.
Part of the work is watching students struggle, watching the cues that they have hit a rough spot, collecting data that reveals how they are trying to work their way through the material, sorting and sifting the clues into important and unimportant sets. Part of the work is thinking about how the students are thinking. Part of the work is looking at how certain soft intangibles (e.g. the Habits of Mind) play out as the student wrestles with the material.
Sure, the algorithm can "learn" cues that indicate certain mistakes in thinking (if you looked at 2 x 3 and got 5, you probably added instead of multiplied), but the more complex the task and the more varied and unpredictable the outcomes, the less capable AI is of dealing with it (e.g., how does Hamlet's character arc reflect his relationship with death?). Does the student show an attempt to really come to grips with the material, or are they just spitting out something that the AI would recognize as correct?
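To make the contrast concrete, here's a minimal sketch (hypothetical; the function name and rules are mine, not any real product's) of the kind of narrow, cue-based check an algorithm can manage for arithmetic:

```python
# Toy misconception check: the sort of rule-based cue an algorithm can
# handle when the space of likely wrong answers is tiny and enumerable.
# Everything here is illustrative, not a real tutoring system.

def diagnose(a: int, b: int, answer: int) -> str:
    """Guess what went wrong on a single multiplication problem."""
    if answer == a * b:
        return "correct"
    if answer == a + b:
        return "likely added instead of multiplied"
    if answer in (a - b, b - a):
        return "likely subtracted instead of multiplied"
    return "unclear -- a human needs to look at the student's work"

print(diagnose(2, 3, 5))   # likely added instead of multiplied
print(diagnose(2, 3, 6))   # correct
print(diagnose(2, 3, 17))  # unclear -- a human needs to look at the student's work
```

The whole trick only works because a multiplication problem has a handful of predictable wrong turns you can list in advance. There is no such lookup table for "explain Hamlet's relationship with death."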
It all makes a difference. Is the student soooooo close, or just flailing blindly? Is the student really trying, or just coasting? Is the student making operational errors, or operating with flawed fundamentals? Part of the work is to try to assess the student's process, but AI can only deal with the result, and the result that the student produces is often the least critical part of the learning process.
For young humans, the best learning requires relationship with another actual human being. Eventually humans learn to teach themselves, but that comes later. Until that day, small humans need interaction with another human, some being who can do more than simply present the "right" answer over and over again.