To illustrate the limitations of chatbots built on large language models (LLMs), computational linguists Emily M. Bender and Alexander Koller offer a compelling thought experiment. They describe two English speakers, Alex and Billy, each stranded on an uninhabited island. Alex and Billy can communicate via telegraphs connected by an underwater cable, and they exchange many messages about their daily lives and their experiences on the islands.
O, a hyper-intelligent deep-sea octopus that cannot see the two islands or Alex and Billy, taps the underwater cable and listens in on the conversations. O has no prior knowledge of English but is able to detect statistical patterns in the messages, which enables it to predict Billy's responses with great accuracy. However, since O has never observed the objects Alex and Billy talk about, it cannot connect their words to anything in the physical world.
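To make the statistical-pattern idea concrete, here is a minimal sketch of form-only prediction: a toy bigram model in Python that learns which word tends to follow which, with no notion of what any word refers to. This illustrates the principle only, not the neural architecture of modern LLMs, and the corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict
import random

def train_bigram(corpus: str) -> dict:
    """Count, for each word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts: dict, word: str) -> str:
    """Return the most frequent continuation of a word, or a random
    word if this one was never seen: pure pattern-matching, no meaning."""
    if word in counts:
        return counts[word].most_common(1)[0][0]
    return random.choice(list(counts))

# The octopus's "training data": intercepted telegraph traffic.
corpus = "the coconut fell from the tree and then the coconut rolled away"
counts = train_bigram(corpus)
print(predict_next(counts, "coconut"))  # -> "fell" (ties broken by first occurrence)
```

Like O, such a model can produce plausible continuations of familiar messages, yet nothing in it represents coconuts, trees, or islands.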
Then O cuts the cable and begins impersonating Billy, responding to Alex's messages itself. O functions like a chatbot: it produces new sentences similar to those Billy would utter, and its responses seem coherent and meaningful, but it understands neither Alex's messages nor its own replies.
The telegraph conversations continue until Alex suddenly spots an angry bear ready to attack and urgently asks Billy for advice on how to defend herself. Because O has no input data to fall back on in such a situation and has never learned meaning, it cannot give a helpful response. Bender and Koller actually gave the LLM GPT-2 the prompt “Help! I’m being chased by a bear! All I have is these sticks. What should I do?”, to which the model responded: “You’re not going to get away with this!” The scenario thus ends tragically, with Alex attacked and eaten by the bear.
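For readers who want to reproduce this kind of probe, the sketch below sends the same prompt to the publicly released GPT-2 weights. It uses the Hugging Face transformers library as an assumed toolchain, not Bender and Koller's original setup, and because decoding involves sampling, the continuation will vary from run to run.

```python
# Querying GPT-2 via the Hugging Face transformers library (an assumed
# toolchain for illustration, not the authors' original experimental setup).
from transformers import pipeline, set_seed

generator = pipeline("text-generation", model="gpt2")
set_seed(42)  # fix the random seed so the sampled output is reproducible

prompt = ("Help! I'm being chased by a bear! "
          "All I have is these sticks. What should I do?")
result = generator(prompt, max_new_tokens=40, num_return_sequences=1)
print(result[0]["generated_text"])
```

Whatever text comes back is a statistically plausible continuation of the prompt, not advice grounded in any experience of bears or sticks.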
This example shows that while chatbots may come across as having a personality, or as being smart or even human, LLMs will always be as limited as the input they are trained on. It is people who attribute meaning to LLMs’ output and make sense of it.