So two hundred thousand years ago, Homo sapiens, our ancestors, were sitting around smoky fire pits where hunted meat was cooked. They used rudimentary language: one-syllable words, a lot of body language, action sounds. But by fifty thousand years ago, their language had become complex enough to tell stories. We can only speculate about what these stories were, but we have circumstantial evidence.
Around the same time, after having sat around fire pits for thousands of years, they got up, left Africa and colonized the entire planet. And according to the Human Genome Project, we are all their descendants. They had discovered that stories were ideal for mobilizing communities, and stories needed no distribution because they were viral.
Five thousand years ago, these oral-tradition stories were written on clay tablets. Five hundred years ago, they were printed on Gutenberg presses. Sixty years ago they became digital, 30 years ago they ended up on the internet, and three years ago, the internet was swallowed by insatiable AIs. And three weeks ago, the LLMs became multimodal. And there are probably a lot more modalities coming. We have no idea if this acceleration is going to slow down; it doesn't seem like it. So we have all sort of fastened our seat belts and are excitedly following the newest things.
Now, from the moment we started trying to teach machines human language, our idea of human language completely changed. We used to think that human language was a system of signs and sounds to convey meaning for the sake of communication. Now we think that view was superficial. There is a deeper function in language. Language encodes world knowledge. Language encodes a model of the world.
When our babies are born, they are blank slates. We cannot put any learned information into DNA; its format only accepts genetic data. So language became our external library. Two hundred thousand years of experiences and knowledge are encoded in our language for our children to find when they grow up. And so while they are learning the language, they are absorbing the wisdom of civilizations.
And those one-syllable words from two hundred thousand years ago have become multi-syllable, and they have become microcosms of compressed intelligence. Think about what it takes to explain to children beautiful words like mercy, grace, evolution, gravity, equilibrium. But language is so much more than just words. That would have been simple. Language is the multimodal internal representation of an external world, built from our sensory input and our spatiotemporal experiences, and it is necessary for higher cognitive functions like planning, predicting and multi-step reasoning.
And that world model, because that is a world model and it lives inside language, is exactly what we need for AGI; otherwise AGI won't work. So instead of -- and this is the big turnaround in theoretical linguistics -- instead of studying how we use language, we are now studying how language uses us to produce that world model, so that we can extract it and reproduce it in silicon. And that's a hard problem.
But in 2017, we made a formidable breakthrough. And when I say we, I mean a team at Google Brain that published "Attention Is All You Need." They proposed a new architecture, the Transformer. Transformers are, let's say, AI models that are very good at understanding and generating human-like text based on ... probabilistic patterns.
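As a rough illustration (this is not from the talk itself), here is a minimal sketch of scaled dot-product attention, the core operation of that paper. It is a toy, single-head version with random weights and made-up dimensions, not the full multi-head, multi-layer architecture:

```python
# A minimal sketch of scaled dot-product attention, the core of the
# Transformer from "Attention Is All You Need" (Vaswani et al., 2017).
# Toy single-head version with random weights, for illustration only.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Each token's query is compared against every other token's key,
    # so distant words can influence each other directly -- this is
    # how Transformers capture long-term dependencies.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # a probabilistic pattern over the context
    return weights @ V

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8                  # 5 toy "tokens", 8-dim embeddings
X = rng.normal(size=(seq_len, d_model))  # stand-in token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = attention(X @ Wq, X @ Wk, X @ Wv)
print(out.shape)  # (5, 8): each token now mixes information from all the others
```

The point of the sketch is the weights matrix: every word attends to every other word in parallel, rather than word by word in sequence.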
OpenAI started scaling it, put some guardrails on it and built a nice interface. So we tried it out, I think it was 2019, and we were impressed. For the first time in 30 years, we really had an idea of what a world model could be. It had the beginnings of a world model.
Now why is it so hard to extract that world model out of language? Well, first of all, language is bigger than the universe, because it is a discrete infinity: a finite alphabet, say 26 letters, whose combinations are infinite. So you must look at it as a cosmic web of meaning, where words and sentences are interlinked across several dimensions of time, emotion and context. And then there are two confounding variables: ambiguities, words that have several meanings, and long-term dependencies, the relationships between distant words.
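To make the "bigger than the universe" claim concrete, here is a back-of-the-envelope calculation (my illustration, not the speaker's): with a 26-letter alphabet, the number of possible strings of length n is 26^n, which passes the commonly cited ~10^80 atoms in the observable universe at around n = 57.

```python
# Toy arithmetic for "discrete infinity": a finite alphabet, unbounded strings.
# With 26 letters there are 26**n strings of length n; the count exceeds the
# estimated ~10**80 atoms in the observable universe around n = 57.
import math

for n in (10, 30, 57):
    print(n, f"26^{n} ~ 10^{n * math.log10(26):.1f}")
# 10 26^10 ~ 10^14.1
# 30 26^30 ~ 10^42.4
# 57 26^57 ~ 10^80.7
```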
So for 30, 40 years, we were like early astronomers, looking through a telescope, gazing at a single star and thereby missing the constellations and the galaxies that gave that star its structure and its meaning. But now, with Transformers and parallel processing power, we can see much more of that universe. We can see the celestial dance between ambiguities, non-linearities and long-term dependencies, because we have a front-row seat in the planetarium of human thought.
So what's next? Well, AGI is what everyone is talking about. Many parties, many people have opinions. Many people even have dates attached to it. I can give you a linguistic view: how a linguist would approach it.
First of all, everything that comes out of the system, you feed back in. And you do that with some human oversight, so that the data is anonymized where it needs to be anonymized. Once you do that, you create a continuous cycle in which the AI and human language are continuously negotiating meaning. This allows the AI to recursively self-learn.
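In code, that cycle might look something like the following sketch. Every name here (anonymize, human_review, fine_tune, self_learning_loop) is a hypothetical placeholder standing in for real systems, not an actual training API:

```python
# A minimal sketch of the feedback cycle described above: model output is
# anonymized, passed through human oversight, and fed back in as training
# data. All functions are hypothetical placeholders, not a real API.

def anonymize(text: str) -> str:
    return text.replace("Alice", "[NAME]")   # stand-in for real PII scrubbing

def human_review(text: str) -> bool:
    return True                               # stand-in for human oversight

def fine_tune(model, examples):
    model["data"].extend(examples)            # stand-in for a training step
    return model

def self_learning_loop(model, prompts, steps=3):
    for _ in range(steps):
        outputs = [f"reply to: {p}" for p in prompts]       # stand-in for generation
        cleaned = [anonymize(o) for o in outputs]           # anonymize where needed
        approved = [o for o in cleaned if human_review(o)]  # human in the loop
        model = fine_tune(model, approved)                  # feed it all back in
    return model

model = self_learning_loop({"data": []}, ["Why do we tell stories?"])
print(len(model["data"]))  # 3: one approved example per cycle
```

The key design point is that the loop body is the negotiation: generation, anonymization, human review and retraining run continuously, not once.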
Now, at some point, there is a critical mass of human feedback; it's enough. And then a tipping point arrives, or let's say a phase transition, where the AI becomes self-sufficient in its learning. It no longer needs that much human input. It will always need some, because we keep evolving, but it will evolve too. So it becomes autonomous and self-sufficient.
When it becomes autonomous and self-sufficient, that AI becomes unavoidable. In French we have a word for this, "incontournable." You cannot really translate it into English, but it means you have to engage with it or you fall behind.
So more people will start to use it. And at that point, that AI will annex more domains through transfer learning. And that AI will become so complex that, like any complex adaptive system, it will become unpredictable. And then suddenly we humans will say, "Hey, it's human," and we will recognize it. And we will probably call it alternative general intelligence, because, you see, in the end, it comes down to us. AGI is a philosophical question.
Now why do I think this is possible? Well ... we are the only animals whose language is complex enough to imagine the future and to create sophisticated tools to get there. We are pushers of boundaries, so it will happen.
Thank you.
(Applause)