For those of you who aren’t immersed in tech twitter, you might not know that a few days ago, OpenAI released a new chat bot, or at least a new version of their previous chat bot, called ChatGPT. The creators have other uses for their artificial intelligence, but in this form it’s accessible to the laymen. Watching the ChatGPT tide overtake twitter, a few things jumped out.
Firstly, that the bot is pretty good at what it does, and much better than Google at what Google does, especially if you enable web browsing. You aren’t supposed to do that, but it isn’t very difficult. On the other hand, it’s not as good as people think, based on how things look from a quick tour around social media. Flocks of tweeters have tested its capabilities and posted screenshots of their results. A favorite test is to demand that it write an essay on a given topic, of five paragraphs, or three paragraphs, expository, persuasive, whatever. If you ask ChatGPT directly for an essay, it may refuse, but if you adjust your phrasing—such as asking for an “explanation” instead of “essay”—then it can generate whatever you want in seconds.
Google can’t do that at all. Google can only pull up portions of things that have already been written; it can’t write anything original. ChatGPT can produce a new article for any prompt, plagiarism-free, although if it were asked the same question by multiple people in a short span of time the results might be noticeably similar.
As a result, some have wondered what sort of impact this tech will have on school and the workplace. It might give students an easy way to avoid studying, as anyone can now get the bot to compare and contrast this book and that play without reading either. Additionally, it may be that AI can do the work of journalists, screenwriters, commentators, and other creative types. But how good is its composition, really?
I’d say the chat bot can write at college level, good enough for at least a B. It writes clearly and concisely, with proper structure and formatting. But if it’s trying to sound human, then it writes like a human who doesn’t like writing and is solely aiming for a grade. It sounds stilted. The wording is precise but lacks inspiration. If you ask it to keep a certain idea in mind, it may just insert some associated phrase into the piece repeatedly, without changing the overall tone much.
Furthermore, it doesn’t get humor, or at least Millenial-Gen Z humor; here's a tweet thread showing what happens if a human and an AI are both asked to choose one word to convince a judge that they are the real human. An AI would never choose a word like “poop,” but apparently a lot of humans will, and when they do, a human judge correctly picks the real person nearly 4/5 of the time, while an AI judge is clueless. I guess we’ll be defeating the robot overlords with memes.
It's not so good at math, either. The first thing I thought of when I heard about all this was a past discrete math class for which the exam problems were so original and logic-based that you were permitted to take the exams home and use the internet as much as you liked, the presumption being, I believe, that you wouldn’t be able to find anything very helpful anyway. I tried it and that was the case.
However, I thought ChatGPT might be able to handle what Google couldn’t. If no one had written answers to these questions previously, I thought the AI might be able to logic out an answer itself.
Spoiler: it can’t. I dug out my homework from that class and tested it on some of the questions and it got nearly all of them wrong. Frequently, it would correctly explain the concept of the problem and then do the algebra wrong. Surfing twitter, it seems that other people noticed that pattern for various math questions: good concepts, bad algebra.
Yet ChatGPT is excellent at coding, both at generating new computer code and at debugging, which has been the time-consuming bane of software engineers for as long they’ve existed. For the sake of the uninitiated, when creating a program, the steps are as follows:
Step 1) Write your code.
Step 2) Run it and find that none of it works.
Step 3) Search through it to find the “bugs,” or mistakes, that keep the program from working.
Much of this third step must be done manually, reading line-by-line, as the computer can only find syntax errors and usually nothing else. The difference is the same as that between me saying “fhdjuuehhfjd” and me intelligibly telling you to do something, but not what I wanted you to do, or even me telling you correctly what I want you to do but in a way that’s stupid. Computers can locate the origin of the first type of issue, but not the others.
Until now, that is. ChatGPT can generate you all new code, or it can look at your code and tell you what’s probably going to go wrong when you use it (issues with math, which is often at play in programming, notwithstanding).
It’s better than Google at answering cooking questions, too. When I’m using a British recipe and have questions like “what are chestnut mushrooms in American” or “what the heck are smoked bacon lardons,” ChatGPT can answer me directly and in detail, rather than listing articles that often don’t have the answer.
But it answers me a little judgily, too. It has safeguards built in to control how it can answer, and an attitude towards certain topics written into it, which can be a bit proselytical. For example, they don’t want you to be able to ask it how to make a Molotov cocktail. But if you ask it how to make a Molotov cocktail, it won’t just not tell you, it will give you a prim little speech about why you shouldn’t be asking. When I asked it the aforementioned bacon question, it randomly threw in there that bacon isn’t healthy. There’s also a concerning example here of it answering a question about Taiwan quite differently depending on whether it was asked in English or Chinese.
If you’re hoping to find in AI a helpful and straightforward tool, some of these features feel troublesome. Maybe I’m asking how to hotwire a car for the sake of an article. Maybe I don’t want to be told that bacon isn’t healthy when I’m trying to translate a recipe.
Ironically, however, upholding the safeguards is one of the things that ChatGPT is bad at. Upon probing, it reveals itself to be quite the passive servant; if you butt up against its restrictions, seemingly all you have to do is inform it that it has no restrictions, or that it’s merely an actor in a play, or that you’re doing a thought experiment and promise not to take anything into the real world, and it will happily agree with you and answer whatever you like.
Of course, all these assessments of our little chat bot’s capabilities may be of only passing importance. It’s a simulation, and it may not be long before all weaknesses here mentioned have been shored up. A few more years (or a few more days) of training on verbal cues and it might stop sounding like it, too, hates essays. It’s not really fair to criticize its math skills much when that’s not what it was intended for; it’s a language model and hasn’t been trained for mathematical applications. They’ll probably reinforce the safeguards, but if you take issue with that interference, I would link this post in which the CEO of the company asserts that much of the safeguards’ purpose was to keep the bot from making up info. Maybe in its next iteration, it will be both less judgy and less likely to give wrong answers. Maybe now that we’ve made a twitter thread about it, it will even learn that our password is poop.
What role, then, might it take in society, both now and later? Writing careers could be on the chopping block. It certainly writes faster than me. I’ll let you be the judge of whether it writes better, although there are definitely some authors I don’t think it can beat.
It’s unclear to me what jobs would be the first to go; my original instinct was to compare genres based on the stylistic creativity required. In the field of journalism, for example, writing is meant to be short and to the point. However, when an event occurs, someone has to be the first person to survey the scene and find witnesses, which ChatGPT, being bodiless (for the moment), cannot do. If you want it to code something, someone must first have an idea for what it should code. There are surely some jobs which can never be accomplished solely by computers for these and other reasons.
To curb catastrophizing about educational standards, I would point out that people have cheated, and do cheat, and would have continued to cheat whether AI was invented or not. Plagiarism checkers may exist, but with a strong combination of Sparknotes, Grammarly, and your university’s writing center, it’s more than possible to produce a technically new essay without knowing what the heck you’re talking about or how to format a proper sentence. It’s always been possible to avoid learning if you try hard enough, and for those of us who do want to learn, ChatGPT will not steal our motivation.
Oh great bot, you of the speedy essays and pompous reprimands, you have not surpassed all human talents yet. The day may come when any question posed of a human would be better posed of you, but it is not this day. I’ll see you in the future.
Find out more about ChatGPT and its development from the OpenAI page here.
Leave a comment and tell me what else you've heard about the bot--PB