People are funny. We're constantly trying to understand and interpret the world around us. I live in a house with two black cats, and let me tell you, every time I see a black, bunched up sweater out of the corner of my eye, I think it's a cat.
It's not just the things we see. Sometimes we attribute more intelligence than might actually be there. Maybe you've seen the dogs on TikTok. They have these little buttons that say things like "walk" or "treat." They can push them to communicate with their owners, and their owners think the dogs are saying some pretty impressive things. But do the dogs know what they're saying?
Or perhaps you've heard the story of Clever Hans, the horse that could do math. And not just simple math problems, really complicated ones, like: if the eighth day of the month falls on a Tuesday, what's the date of the following Friday? That's pretty impressive for a horse. Unfortunately, Hans wasn't doing math, but what he was doing was equally impressive. Hans communicated his answers by tapping his hoof, and he had learned to watch the people in the room to tell when he should stop tapping. It turns out that if you know the answer to "if the eighth day of the month falls on a Tuesday, what's the date of the following Friday," you will subconsciously change your posture once the horse has given the correct 11 taps. So Hans couldn't do math, but he had learned to watch the people in the room who could do math, which, I mean, is still pretty impressive for a horse. But this is an old picture, and we would not fall for Clever Hans today. Or would we?
Well, I work in AI, and let me tell you, things are wild. There have been multiple examples of people being completely convinced that AI understands them. In 2022, a Google engineer thought that Google’s AI was sentient. And you may have had a really human-like conversation with something like ChatGPT. But models we're training today are so much better than the models we had even five years ago. It really is remarkable.
So at this crazy moment in time, let's ask a seemingly crazy question: Does AI understand us, or are we having our own Clever Hans moment?
Some philosophers think that computers will never understand language. To illustrate this, they developed a thought experiment called the Chinese room argument. In the Chinese room, there is a hypothetical person who does not understand Chinese, but he has with him a set of instructions that tell him how to respond in Chinese to any Chinese sentence. Here's how the Chinese room works. A piece of paper comes in through a slot in the door with something written in Chinese on it. The person uses their instructions to figure out how to respond. They write the response down on a piece of paper and send it back out through the door. To somebody who speaks Chinese standing outside the room, it might seem like the person inside the room speaks Chinese. But we know they do not, because no knowledge of Chinese is required to follow the instructions. Performance on this task does not show that you know Chinese.
So what does that tell us about AI? Well, when we speak to one of these AIs like ChatGPT, you and I are the person standing outside the room. We're feeding in English sentences, and we're getting English sentences back. It really looks like the models understand us. It really looks like they know English. But under the hood, these models are just following a set of instructions, albeit a complex one. So how do we know if AI understands us?
To answer that question, let's go back to the Chinese room. Let's say we have two Chinese rooms. In one room is somebody who actually speaks Chinese, and in the other room is our impostor. When the person who actually speaks Chinese gets a piece of paper with something written in Chinese on it, they can read it, no problem. But when our impostor gets it, he again has to use his set of instructions to figure out how to respond. From the outside, it might be impossible to distinguish these two rooms, but we know that inside, something really different is happening. To illustrate that, let's say that inside the minds of our two people, inside of our two rooms, is a little scratch pad. And everything they have to remember in order to do this task has to be written on that little scratch pad. If we could see what was written on that scratch pad, we would be able to tell how different their approaches to the task are. So though the input and the output of these two rooms might be exactly the same, the process of getting from input to output is completely different.
So again, what does that tell us about AI? Even if AI generates completely plausible dialogue and answers questions just like we would expect, it may still be an impostor of sorts. If we want to know if AI understands language like we do, we need to know what it's doing. We need to get inside the room to see what it's doing. Is it an impostor or not? We need to see its scratch pad, and we need to be able to compare it to the scratch pad of somebody who actually understands language. But like scratch pads in brains, that's not something we can actually see, right?
Well, it turns out that we can kind of see scratch pads in brains. Using something like fMRI or EEG, we can take what are like little snapshots of the brain while it's reading. So we have people read words or stories and then take pictures of their brains. And those brain images are like fuzzy, out-of-focus pictures of the scratch pad of the brain. They tell us a little bit about how the brain is processing and representing information while you read.
So here are three brain images taken while a person read the words "apartment," "house" and "celery." You can see just with your naked eye that the brain images for "apartment" and "house" are more similar to each other than they are to the brain image for "celery." And you know, of course, that apartments and houses are more similar to each other than they are to celery, even just as words. Said another way, the brain uses its scratch pad when reading the words "apartment" and "house" in a way that's more similar than when it reads the word "celery." The scratch pad tells us a little bit about how the brain represents language. It's not a perfect picture of what the brain's doing, but it's good enough.
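To make that kind of comparison concrete, here is a minimal Python sketch of comparing brain images as vectors of voxel activations. The arrays below are made-up stand-ins for real fMRI data, just to show the shape of the computation:

```python
import numpy as np

# Hypothetical voxel vectors standing in for the real fMRI images; the sizes
# and noise levels are invented purely for illustration.
rng = np.random.default_rng(0)
shared = rng.normal(size=1000)
apartment = shared + 0.5 * rng.normal(size=1000)  # image for "apartment"
house = shared + 0.5 * rng.normal(size=1000)      # image for "house": similar pattern
celery = rng.normal(size=1000)                    # image for "celery": unrelated pattern

def cosine_similarity(a, b):
    """Similarity between two flattened brain images (vectors of voxel activations)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(apartment, house))   # relatively high
print(cosine_similarity(apartment, celery))  # close to zero
```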
OK, so we have scratch pads for the brain. Now we need a scratch pad for AI. Inside a lot of AIs is a neural network, and inside of a neural network is a bunch of little neurons. Here, the neurons are these little gray circles. And we would like to know: what is the scratch pad of a neural network? Well, when we feed a word into a neural network, each of the little neurons computes a number. Those little numbers I'm representing here with colors. Every neuron computes this little number, and those numbers tell us something about how the neural network is processing language. Taken together, all of those little circles paint us a picture of how the neural network is representing language, and they give us the scratch pad of the neural network.
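As a rough sketch of what reading off a neural network's scratch pad can look like in code, here is one way to pull out the hidden activations for a single word using the Hugging Face transformers library. The specific model name and layer choice are illustrative assumptions, not the setup used in the experiments described here:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Any pretrained language model will do for illustration; "bert-base-uncased"
# is an arbitrary choice, not the specific network used in these experiments.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def scratch_pad(word, layer=-1):
    """Return the hidden activations (the network's 'scratch pad') for a word."""
    inputs = tokenizer(word, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    # One activation vector per token at the chosen layer; average into one vector.
    return outputs.hidden_states[layer].squeeze(0).mean(dim=0)

print(scratch_pad("apartment").shape)  # a vector of 768 numbers for this model
```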
OK, great. Now we have two scratch pads, one from the brain and one from AI. And we want to know: Is AI doing something like what the brain is doing? How can we test that?
Here's what researchers have come up with. We're going to train a new model. That new model is going to look at the neural network scratch pad for a particular word and try to predict the brain scratch pad for the same word. We can do it the other way around, too. If the brain and AI are doing nothing alike, if they have nothing in common, we won't be able to do this prediction task. It won't be possible to predict one from the other.
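In practice, this prediction step is often a simple regularized linear regression from one scratch pad to the other. Here is a minimal sketch, assuming we already have matching matrices of network activations and brain images for a set of words; the data below are hypothetical placeholders:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical placeholder data: one row per word, aligned across the two matrices.
n_words, n_units, n_voxels = 200, 768, 1000
rng = np.random.default_rng(0)
network_pads = rng.normal(size=(n_words, n_units))  # neural network scratch pads
brain_pads = rng.normal(size=(n_words, n_voxels))   # brain scratch pads (fMRI/EEG)

# Fit the new model on most of the words...
train, test = slice(0, 180), slice(180, 200)
mapper = Ridge(alpha=1.0)
mapper.fit(network_pads[train], brain_pads[train])

# ...then predict the brain scratch pads for held-out words it has never seen.
predicted_brain_pads = mapper.predict(network_pads[test])

# The same recipe works the other way around: swap the two matrices to predict
# the network scratch pad from the brain scratch pad.
```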
So we've reached a fork in the road, and you can probably tell I'm about to tell you one of two things: I'm going to tell you AI is amazing, or I'm going to tell you AI is an impostor. Researchers like me love to remind you that AI is nothing like the brain. And that is true. But could it also be that AI and the brain share something in common?
So we've done this scratch pad prediction task, and it turns out, 75 percent of the time the predicted neural network scratch pad for a particular word is more similar to the true neural network scratch pad for that word than it is to the neural network scratch pad for some other randomly chosen word. 75 percent is much better than chance. What about for more complicated things, not just words, but sentences, even stories? Again, this scratch pad prediction task works. We're able to predict the neural network scratch pad from the brain and vice versa. Amazing. So does that mean that neural networks and AI understand language just like we do? Well, truthfully, no. Though these scratch pad prediction tasks show above-chance accuracy, the underlying correlations are still pretty weak. And though neural networks are inspired by the brain, they don't have the same kind of structure and complexity that we see in the brain. Neural networks also don't exist in the world. A neural network has never opened a door, or seen a sunset, or heard a baby cry. Can a neural network that doesn't actually exist in the world, that hasn't really experienced the world, really understand language about the world?
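The 75 percent figure comes from a pairwise test of roughly this shape: for a held-out word, is the predicted scratch pad closer to that word's true scratch pad than to the true scratch pad of some other randomly chosen word? Here is a minimal sketch of that check, again on hypothetical data:

```python
import numpy as np

def pairwise_accuracy(predicted, true, rng):
    """Fraction of words whose predicted scratch pad is closer to its own true
    scratch pad than to the true scratch pad of a randomly chosen other word."""
    n = len(true)
    hits = 0
    for i in range(n):
        j = rng.choice([k for k in range(n) if k != i])  # some other word
        closer_to_own = (np.linalg.norm(predicted[i] - true[i])
                         < np.linalg.norm(predicted[i] - true[j]))
        hits += closer_to_own
    return hits / n

# Hypothetical held-out data: noisy but informative predictions.
rng = np.random.default_rng(0)
true = rng.normal(size=(50, 768))
predicted = true + 1.0 * rng.normal(size=(50, 768))
print(pairwise_accuracy(predicted, true, rng))  # well above the 0.5 chance level
```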
Still, these scratch pad prediction experiments have held up -- multiple brain imaging experiments, multiple neural networks. We've also found that as the neural networks get more accurate, they also start to use their scratch pad in a way that becomes more brain-like. And it's not just language. We've seen similar results in navigation and vision.
So AI is not doing exactly what the brain is doing, but it's not completely random either. So from where I sit, if we want to know if AI really understands language like we do, we need to get inside of the Chinese room. We need to know what the AI is doing, and we need to be able to compare that to what people are doing when they understand language.
AI is moving so fast. Today, I'm asking you, "Does AI understand language?" That might seem like a silly question in ten years. Or ten months.
(Laughter)
But one thing will remain true. We are meaning-making humans, and we are going to continue to look for meaning and interpret the world around us. And we will need to remember that if we only look at the input and output of AI, it's very easy to be fooled. We need to get inside the metaphorical room of AI in order to see what's happening. It's what's inside that counts.
Thank you.
(Applause)