Max Tegmark: How to keep AI under control

Five years ago, I stood on the TED stage and warned about the dangers of superintelligence. I was wrong. It went even worse than I thought.

(Laughter)

I never thought governments would let AI companies get this far without any meaningful regulation. And the progress of AI went even faster than I predicted. Look, I showed this abstract landscape of tasks where the elevation represented how hard it was for AI to do each task at human level. And the sea level represented what AI could do back then. And boy oh boy, has the sea been rising fast ever since. But a lot of these tasks have already gone blub blub blub blub blub blub. And the water is on track to submerge all land, matching human intelligence at all cognitive tasks.

This is a definition of artificial general intelligence, AGI, which is the stated goal of companies like OpenAI, Google DeepMind and Anthropic. And these companies are also trying to build superintelligence, leaving human intelligence far behind. And many think it'll only be a few years, maybe, from AGI to superintelligence.

So when are we going to get AGI? Well, until recently, most AI researchers thought it was at least decades away. And now Microsoft is saying, "Oh, it's almost here. We're seeing sparks of AGI in ChatGPT-4," and the Metaculus betting site is showing the time left to AGI plummeting from 20 years away to three years away in the last 18 months. And leading industry people are now predicting that we have maybe two or three years left until we get outsmarted. So you better stop talking about AGI as a long-term risk, or someone might call you a dinosaur stuck in the past.

It's really remarkable how AI has progressed recently. Not long ago, robots moved like this.

(Music)

Now they can dance.

(Music)

Just last year, Midjourney produced this image. This year, the exact same prompt produces this. Deepfakes are getting really convincing.

(Video) Deepfake Tom Cruise: I’m going to show you some magic.

It's the real thing.

(Laughs)

I mean ... It's all ... the real ... thing.

Max Tegmark: Or is it?

And Yoshua Bengio now argues that large language models have mastered language and knowledge to the point that they pass the Turing test. I know some skeptics are saying, "Nah, they're just overhyped stochastic parrots that lack a model of the world," but they clearly have a representation of the world. In fact, we recently found that Llama-2 even has a literal map of the world in it. And AI also builds geometric representations of more abstract concepts like what it thinks is true and false.

So what's going to happen if we get AGI and superintelligence? If you only remember one thing from my talk, let it be this. AI godfather Alan Turing predicted that the default outcome is the machines take control. The machines take control. I know this sounds like science fiction, but, you know, having AI as smart as GPT-4 also sounded like science fiction not long ago. And if you think of AI, if you think of superintelligence in particular, as just another technology, like electricity, you're probably not very worried. But you see, Turing thinks of superintelligence more like a new species. Think of it, we are building creepy, super-capable, amoral psychopaths that don't sleep and think much faster than us, can make copies of themselves and have nothing human about them at all. So what could possibly go wrong?

(Laughter)

And it's not just Turing. OpenAI CEO Sam Altman, who gave us ChatGPT, recently warned that it could be "lights out for all of us." Anthropic CEO Dario Amodei even put a number on this risk: 10-25 percent. And it's not just them. Human extinction from AI went mainstream in May when all the AGI CEOs and who's who of AI researchers came out and warned about it. And last month, even the number one of the European Union warned about human extinction by AI.

So let me summarize everything I've said so far in just one slide of cat memes. Three years ago, people were saying, "It's inevitable, superintelligence, it'll be fine, it's decades away." Last year it was more like, "It's inevitable, it'll be fine." Now it's more like, "It's inevitable."

(Laughter)

But let's take a deep breath and try to raise our spirits and cheer ourselves up, because the rest of my talk is going to be about the good news, that it's not inevitable, and we can absolutely do better, alright?

(Applause)

So ... The real problem is that we lack a convincing plan for AI safety. People are working hard on evals looking for risky AI behavior, and that's good, but clearly not good enough. They're basically training AI to not say bad things rather than not do bad things. Moreover, evals and debugging are really just necessary, not sufficient, conditions for safety. In other words, they can prove the presence of risk, not the absence of risk. So let's up our game, alright? Try to see how we can make provably safe AI that we can control.

Guardrails try to physically limit harm. But if your adversary is superintelligence or a human using superintelligence against you, right, trying is just not enough. You need to succeed. Harm needs to be impossible. So we need provably safe systems. Provable, not in the weak sense of convincing some judge, but in the strong sense of there being something that's impossible according to the laws of physics. Because no matter how smart an AI is, it can't violate the laws of physics and do what's provably impossible. Steve Omohundro and I wrote a paper about this, and we're optimistic that this vision can really work. So let me tell you a little bit about how.

There's a venerable field called formal verification, which proves stuff about code. And I'm optimistic that AI will revolutionize the automatic proving business and also revolutionize program synthesis, the ability to automatically write really good code. So here is how our vision works. You, the human, write a specification that your AI tool must obey, that it's impossible to log in to your laptop without the correct password, or that a DNA printer cannot synthesize dangerous viruses. Then a very powerful AI creates both your AI tool and a proof that your tool meets your spec. Machine learning is uniquely good at learning algorithms, but once the algorithm has been learned, you can re-implement it in a different computational architecture that's easier to verify.

Now you might worry, how on earth am I going to understand this powerful AI and the powerful AI tool it built and the proof, if they're all too complicated for any human to grasp? Here is the really great news. You don't have to understand any of that stuff, because it's much easier to verify a proof than to discover it. So you only have to understand or trust your proof-checking code, which could be just a few hundred lines long. And Steve and I envision that such proof checkers get built into all our compute hardware, so it just becomes impossible to run very unsafe code.
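
To make the asymmetry between checking and discovering a proof concrete, here is a minimal sketch of a proof checker for a toy logic. It is not the checker envisioned in the talk. It assumes a hypothetical proof format in which every step is either a stated premise or follows from two earlier steps by modus ponens; the formula encoding and the name `check_proof` are illustrative assumptions.

```python
from typing import List, Tuple, Union

# A formula is an atom (a string) or an implication ("->", antecedent, consequent).
# This encoding is an assumption made for the sketch.
Formula = Union[str, Tuple[str, "Formula", "Formula"]]

# A proof step is ("premise", formula) or ("mp", i, j, formula), meaning
# "formula follows by modus ponens from step i (A) and step j (A -> formula)".
Step = tuple


def check_proof(premises: List[Formula], steps: List[Step], goal: Formula) -> bool:
    """Return True only if every step is justified and the last step is the goal."""
    proved: List[Formula] = []
    for step in steps:
        if len(step) == 2 and step[0] == "premise":
            formula = step[1]
            if formula not in premises:
                return False
        elif len(step) == 4 and step[0] == "mp":
            _, i, j, formula = step
            # Both cited steps must already have been checked.
            if not (0 <= i < len(proved) and 0 <= j < len(proved)):
                return False
            # Step j must be exactly "step i implies formula".
            if proved[j] != ("->", proved[i], formula):
                return False
        else:
            return False
        proved.append(formula)
    return bool(proved) and proved[-1] == goal


premises = ["p", ("->", "p", "q"), ("->", "q", "r")]
good_proof = [
    ("premise", "p"),
    ("premise", ("->", "p", "q")),
    ("mp", 0, 1, "q"),
    ("premise", ("->", "q", "r")),
    ("mp", 2, 3, "r"),
]
bad_proof = [("premise", "p"), ("mp", 0, 0, "r")]

print(check_proof(premises, good_proof, "r"))
print(check_proof(premises, bad_proof, "r"))
```

The checker never searches for a proof. It only confirms that each step cites earlier steps correctly, which is why it stays short no matter how hard the proof was to find.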

What if the AI, though, isn't able to write that AI tool for you? Then there's another possibility. You train an AI to first just learn to do what you want and then you use a different AI to extract out the learned algorithm and knowledge for you, like an AI neuroscientist. This is in the spirit of the field of mechanistic interpretability, which is making really impressive rapid progress. Provably safe systems are clearly not impossible.

Let's look at a simple example of where we first machine-learn an algorithm from data and then distill it out in the form of code that provably meets spec, OK? Let’s do it with an algorithm that you probably learned in first grade, addition, where you loop over the digits from right to left, and sometimes you do a carry. We'll do it in binary, as if you were counting on two fingers instead of ten. And we first train a recurrent neural network, never mind the details, to nail the task. So now you have this algorithm that you don't understand how it works in a black box defined by a bunch of tables of numbers that we, in nerd speak, call parameters. Then we use an AI tool we built to automatically distill out from this the learned algorithm in the form of a Python program. And then we use the formal verification tool known as Dafny to prove that this program correctly adds up any numbers, not just the numbers that were in your training data.
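
The distilled program itself does not appear in the transcript, so the following is only a sketch of what a right-to-left binary adder with a carry might look like in Python. The function name `add_binary`, the least-significant-bit-first list encoding, and the helpers `to_bits` and `to_int` are assumptions for illustration. The check at the end only tests a bounded range of inputs; it is not a proof, which is the gap a verifier such as Dafny is meant to close.

```python
from typing import List


def add_binary(a: List[int], b: List[int]) -> List[int]:
    """Add two binary numbers given as bit lists, least significant bit first."""
    result: List[int] = []
    carry = 0
    for i in range(max(len(a), len(b))):
        bit_a = a[i] if i < len(a) else 0
        bit_b = b[i] if i < len(b) else 0
        total = bit_a + bit_b + carry
        result.append(total % 2)  # digit written in this column
        carry = total // 2        # carried into the next column
    if carry:
        result.append(carry)
    return result


def to_bits(n: int) -> List[int]:
    """Hypothetical helper: non-negative int to bit list, least significant first."""
    bits: List[int] = []
    while n > 0:
        bits.append(n % 2)
        n //= 2
    return bits


def to_int(bits: List[int]) -> int:
    """Hypothetical helper: bit list (least significant first) back to an int."""
    return sum(bit << i for i, bit in enumerate(bits))


# Bounded test, not a proof: every pair of inputs below 64.
assert all(
    to_int(add_binary(to_bits(x), to_bits(y))) == x + y
    for x in range(64)
    for y in range(64)
)
```

A formal verifier would instead establish the property for inputs of every length, which no finite test loop can do.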

So in summary, provably safe AI, I'm convinced is possible, but it's going to take time and work. And in the meantime, let's remember that all the AI benefits that most people are excited about actually don't require superintelligence. We can have a long and amazing future with AI.

So let's not pause AI. Let's just pause the reckless race to superintelligence. Let's stop obsessively training ever-larger models that we don't understand. Let's heed the warning from ancient Greece and not get hubris, like in the story of Icarus. Because artificial intelligence is giving us incredible intellectual wings with which we can do things beyond our wildest dreams if we stop obsessively trying to fly to the sun.

Thank you.

(Applause)
