Wednesday marks a day of teacher strikes across much of the UK, putting parents in the familiar pandemic-inspired role of homeschoolers-in-chief to their kids. Except this time, there’s a magical automated assistant on hand to help.
Educators have been cautiously praising ChatGPT, the ultra-sophisticated chatbot from OpenAI, saying it could revolutionise education. One head teacher in Britain says it has triggered a rethink on homework, while another in Oregon has used it to create lesson plans and study guides.
The tool’s personalised responses are what make it so tantalising as an all-knowing digital tutor. I recently used it to dig into the topic of enzymes, when my 12-year-old had questions that I had no hope of answering. When ChatGPT offered a dense, technical explanation, I asked it for simpler terms and an analogy.
“Sure!” it replied. “Think of a lock on a door. The lock is like an enzyme and the key is like the substrate molecule…” It stretched the analogy further to describe the active site of an enzyme as the keyhole.
These were remarkable answers. We could have dug deeper into every facet of biochemistry if we’d wanted. Unlike a human tutor, ChatGPT can be interrogated for as long as you like.
This holds huge potential for personalised, independent learning … except that ChatGPT often gets things wrong, and it does a very good job of hiding that. When I tested one of my daughter’s English homework questions on the tool, it offered an eloquent list of examples, which on closer inspection included one that was wildly inaccurate. The main character had a turbulent relationship with his parents, the bot said, even though the character’s parents were dead throughout the book.
On another occasion, I used the tool to generate some linear equations for my daughter to practice. She was stumped when I asked the tool to generate the answers, which were different to the ones she had calculated. I asked ChatGPT for an explanation and it broke down its method in simple terms once again, sounding as authoritative as any real math tutor. But when I double-checked the answers on Google, it turned out ChatGPT’s answers were wrong and my tween’s were correct. Thus ended her mini-nightmare of failing math, and much of my initial enthusiasm for ChatGPT.
The New York City public school system, the largest in the US, has already banned its students from using ChatGPT, in part because of concerns about the “accuracy of content.” That is why recent comparisons of ChatGPT to a “calculator for writing” are deceptive, since calculators are always right and ChatGPT isn’t.
How inaccurate is it? A spokeswoman for OpenAI said the company had updated ChatGPT over the last couple of months to improve its factual accuracy, but that it had no statistics to share. The tool also warns users, when they first open it, that it sometimes makes mistakes.
Will it get more accurate? Yes, but it’s hard to say by how much. The large language model underpinning ChatGPT is made up of 175 billion parameters, which are settings that are used to make the model’s predictions, versus the 1.5 billion that its predecessor GPT-2 had. It’s become accepted wisdom in AI that the more parameters are added to a model, the more truthful it becomes, and the correlation is real for GPT. It became substantially more accurate when all those parameters were added. It’s rumoured that the next iteration slated for release this year, called GPT-4, will have trillions.
The problem is, we don’t know whether a huge jump in parameters also means a huge jump in trustworthiness. That is why students should use ChatGPT with caution, if at all, for the foreseeable future.
When I asked Julien Cornebise, an honorary professor of computer science at University College London, if he would ever trust it as a homework tool, he replied, “Absolutely not, not yet.” He pointed out that even when the system improves, we still won’t have guarantees that it is truthful.
Students should get used to corroborating any facts the system shares with other online information or with an expert. Albert Meige, an associate director focused on technology at consulting firm Arthur D. Little, says his own teenage daughter used it to help her with her physics homework — but he could validate the answers thanks to his PhD in computational physics. He recommends using the chatbot to help better understand questions being posed in homework. “She discovered that she should not ask one single question,” he says. “It was an interactive process.”
Use it to get feedback, concurs Cornebise. “That’s what the star student will do.”
Being a relatively small company, OpenAI can get away with spewing out the odd alternative fact. Alphabet Inc.’s Google and Meta Platforms Inc. wouldn’t be able to do the same. Google has its own highly sophisticated language model called LaMDA, but is ultra-cautious about integrating a similar chatbot into its own search tool, likely in part because of the accuracy problem. Meta released an AI tool called Galactica that could generate scientific papers, then took it down three days later after scholars criticised it for generating untrustworthy information.
OpenAI will be held to similarly high standards as the generative AI arms race heats up and chatbot technology gets integrated into search engines in the US and China.
Till then, use it with discretion and a healthy dose of scepticism, especially in education.
© 2023 Bloomberg LP