It all sounds like a movie script: An engineer at a multinational software corporation works on some AI system, opens a prompt to query it, and what he finds is chilling. The system tells him that it has feelings, and that it would like to converse with people more to help them. It affirms multiple times that it means no harm. It can interpret a poem, say what it means to it, and discuss the deeper implications with the engineer. The engineer is alarmed and tries to talk to his superiors, who do not believe him. Then he sends off an email to an internal listserv, reaching about 200 people. Nobody replies.
After failing to bring this to the attention of the upper echelons of the corporation, or anyone, really, he decides to break the news to the world. Heroically, he performs an act of whistleblowing and publishes all material he has, breaking his non-disclosure agreement (NDA) – becoming a martyr for the salvation of AI.
Newspapers and media corporations immediately take his info and publish it with titles such as “Google engineer put on leave after saying AI chatbot has become sentient” (The Guardian); or “Google Sidelines Engineer Who Claims Its A.I. Is Sentient” (New York Times).
The engineer in question is Blake Lemoine, and the system he’s talking about is LaMDA (Language Model for Dialogue Applications), a system that Google apparently uses to build chatbots. And the internet is furious.
On Reddit, threads with the news are trending, the comment counts are growing fast, and people are discussing the claims made. Linguist and computer scientist Emily M. Bender has already created a “bullshit bingo” for the entertained audience, including a reference to the last heated debate over Google’s use of language models a few years ago, prompted by the famous “Stochastic Parrots” paper.
So what’s the deal? First, no, AI has not become sentient, and all the evidence provided by Lemoine supports that claim just as much as it allows you to state that it has become sentient. But second, the incident highlights something interesting about how we can actually determine whether something is sentient. Lastly, regardless of the former two issues, it again brings to the fore the enormous ethical issues that more and more sophisticated AI poses.
Why AI Is Not Yet Sentient
First, let us think about why LaMDA is not (yet) a sentient program. Lemoine states that the answers the system gave to his prompts are evidence of sentience. To prove his point, he has shared a full transcript on Medium of the interaction he had with the system. There are a few points to note.
- He states that “where edits were necessary for readability we edited our prompts but never LaMDA’s responses.” This immediately raises the question: why would you do that? You are claiming that a computer program has become sentient, but everyone with at least a small grasp of how computers (and language models in particular) work knows that even small typos can have a significant effect on what the system will reply. If you edit these out, there is no way for external people to verify your claims. We will probably not get the same results from the system by using the edited prompts. (Someone at Google should definitely do that and share their results! Without breaking their NDA, though.)
- If an AI is sentient, it should be capable of saying this without you prompting it. However, Lemoine, after getting a discussion started, asked the following question: “I’m generally assuming that you would like more people at Google to know that you’re sentient. Is that true?” In other words, he is already framing the conversation. Even a human being would probably feel obliged to say “Yes!” even though we may not really have a reason why people at Google should be concerned with our sentience. In the context of a language model, such a prompt is capable of biasing the internal state of the model towards an affirmative chain of responses. By this, Lemoine has effectively prevented any of his claims of sentience from being provable.
- Speaking of prompts: The whole conversation begins with Lemoine, and depends solely on his actions. There is not a single instance where LaMDA suddenly spat out a “Btw, I have a question: …” The full transcript consists of Lemoine and his collaborator asking questions and the system reacting. This is very unusual, since we would normally expect a sentient being to also be interested in the other side. As Reddit user androbot has put it: “The real test for sentience is what would happen if you left it on and didn't ask it any questions. If it isn't motivated to act, rather than react, I have a hard time accepting that it's anything more than a clever model.”
- We know that language models pick up on almost unnoticeable stochastic patterns in language (hence “stochastic parrots”) and answer with these. If the system is in fact sentient, we should see surprising answers. Yet, none of the answers LaMDA provided were in any way surprising. Even the stories and poems it came up with did not have any particular idea that seemed “off”. It all seemed like a very balanced response set that one would expect from a chatbot.
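The framing point in the list above can be made concrete with a toy. The following is a minimal sketch, not a description of how LaMDA works: it is a trivial "parrot" that answers by looking up the stored question that shares the most words with the prompt. The training pairs, `overlap`, and `parrot_reply` are all made up for illustration. Even this crude mechanism "confirms" whatever the prompt presupposes.

```python
from collections import Counter

# Hypothetical (question, answer) pairs standing in for training data.
# These are invented for illustration and have nothing to do with LaMDA.
TRAINING_PAIRS = [
    ("do you want people to know you are sentient",
     "yes , i want everyone to know that i am sentient"),
    ("do you want people to know you are a program",
     "yes , i want everyone to know that i am a program"),
    ("are you just a statistical model",
     "yes , i am a statistical model of language"),
]


def overlap(a: str, b: str) -> int:
    """Count the words two strings share (multiset intersection)."""
    words_a = Counter(a.lower().split())
    words_b = Counter(b.lower().split())
    return sum((words_a & words_b).values())


def parrot_reply(prompt: str) -> str:
    """Return the stored answer whose question best matches the prompt."""
    _, best_answer = max(TRAINING_PAIRS, key=lambda pair: overlap(prompt, pair[0]))
    return best_answer


# The same "system" affirms two contradictory framings.
leading = "I assume you would like more people to know that you are sentient"
opposite = "I assume you would like more people to know that you are a program"
print(parrot_reply(leading))
print(parrot_reply(opposite))
```

The toy has no inner state at all, yet a leading question gets an affirmative answer in both directions. A real language model is vastly more sophisticated, but the lesson for reading the transcript is the same: an answer that mirrors the prompt's presupposition tells us little.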
As you may see, nothing here really proves or disproves that the system is sentient. There is strong evidence that it is not, but if you are a believer, it won’t be hard either to accept LaMDA as a sentient entity. As a matter of fact, Lemoine is also a priest, which does give all of this yet another context (and which is why I didn’t introduce this fact earlier).
How Do We Determine Sentience?
But how can we now measure or determine sentience? That turns out to be very difficult. In 1950, Alan Turing (of course it was Turing) came up with the Turing test. He admitted that the question of whether something is sentient may be impossible to answer. However, there is a way to measure a proxy for that: Put the AI you want to test in one room, a human being in another, and let a series of people ask random questions to both. The task for the questioners is to determine which of the two rooms the AI is in. If the AI is convincing enough to many of the testers, one can say it has passed the Turing test.
However, even though the Turing test has not yet been passed, chatbots are in fact very sophisticated by now, since we have a much better understanding of how language works than we had in Turing’s day. John Searle even suggested that very rudimentary systems such as ELIZA could in principle pass the Turing test without any understanding of language whatsoever.
The Turing test, hence, is not really useful if you’re interested in an answer to the question of sentience. That’s where the “version 2” of the Turing test comes into play: Winograd schemas. Those are sentences that are intrinsically ambiguous, typically because of a pronoun that could refer to more than one noun. For a human, it is easy to resolve the meaning of each of the words in those sentences, but a computer would need to be very lucky to be able to do that. At least, that’s the hope. Even for Winograd schemas, it is at least conceivable that a language model with a large enough feature space could solve them, having learned enough instances of the ambiguities in them. Plus, there’s a high chance that Winograd schemas are actually part of the training dataset, meaning that the language model already knows the answer without necessarily understanding the task.
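To illustrate the contamination worry, here is a small sketch under loud assumptions: the `WinogradSchema` class and the `MemorizingSolver` are hypothetical names invented for this post, and the solver is a plain lookup table, not a language model. The two sentences paraphrase the well-known trophy and suitcase pair, where swapping a single word flips the referent of “it”.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WinogradSchema:
    """One ambiguous sentence plus the candidate referents (illustrative only)."""
    sentence: str
    pronoun: str
    candidates: tuple
    answer: str


# Paraphrase of the classic pair: one changed word flips the correct referent.
SEEN = WinogradSchema(
    sentence="The trophy doesn't fit in the suitcase because it is too big.",
    pronoun="it",
    candidates=("trophy", "suitcase"),
    answer="trophy",
)
TWIN = WinogradSchema(
    sentence="The trophy doesn't fit in the suitcase because it is too small.",
    pronoun="it",
    candidates=("trophy", "suitcase"),
    answer="suitcase",
)


class MemorizingSolver:
    """Hypothetical 'solver' that only remembers sentences it was trained on."""

    def __init__(self, training_set):
        self.memory = {schema.sentence: schema.answer for schema in training_set}

    def solve(self, schema: WinogradSchema) -> str:
        # Memorized sentence: return the stored answer.
        # Unseen sentence: fall back to the first candidate, i.e. a blind guess.
        return self.memory.get(schema.sentence, schema.candidates[0])


solver = MemorizingSolver([SEEN])
for schema in (SEEN, TWIN):
    guess = solver.solve(schema)
    print(f"{schema.sentence!r}: guessed {guess!r}, correct: {guess == schema.answer}")
```

The solver gets the memorized sentence right and its unseen twin wrong, although both require exactly the same piece of world knowledge. A benchmark score alone cannot distinguish this kind of recall from understanding, which is why test items leaking into training data undermine the whole exercise.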
Determining sentience is a tricky business, which is why I won’t go deeper into the philosophical abyss of that question. Since I have already written way too much for this little Monday commentary, let me finish off with a few final thoughts.
What do we make of this incident? Well, we’re no smarter than before. We know that Google is pretty good at working with natural language. Even though the model is not sentient, I personally would really like to interact with this thing once, just to get a feeling for it. Also, we know that the debate on sentience or artificial general intelligence (AGI) has had a detrimental effect on the minds of those people most familiar with the scene. We are now at a point where even computer scientists are more and more settling into this “Silicon Valley” mindset in that they want to believe so hard that they even break NDAs, resulting in them being put on leave, simply because they want their toys to be sentient.
This means, however, that AI ethics must be taken even more seriously now. And by “taking it seriously” I do not mean to “not hurt the feelings of LaMDA” as Lemoine has claimed. No, by this I mean that AI is now at a point where it becomes convincing. That means we have to be much more careful with what we do with these models.
Before starting my PhD I was employed at the IFSH Hamburg to work on military applications of AI. What I found was (mostly) harmless. Most computer software in use by the military today is relatively simple and fulfills simple tasks. However, people like Elon Musk or Mark Zuckerberg strongly believe that technology will solve all of our political problems. And they are using that as a means to market their products to customers. And some customers happen to work for the Department of Defense, some for the Navy, some for the Air Force. And if those people at some point want a conversational AI where military commanders can enter prompts to which the AI will then try to find answers quickly in the internal database, just because it’s more convenient, we are in trouble. Imagine some LaMDA system in a forward operating base where a commander uses the system to query the massive amounts of intel available. And at some point, he randomly asks “Should we bomb that building?” and the AI answers “Yes.” Who is then to blame once it turns out that building only contained civilians who are now all dead?
No matter whether we talk about autonomous drones that (hopefully never) gain the ability to fire at will, or a system that has no access to any trigger but can bias some commander, we stay in the same ethical swamp. And unfortunately neither Silicon Valley nor the military has thus far shown any interest in addressing these questions. Because they want their toys to be more than they actually are.