Max Tegmark: How to keep AI under control

Five years ago, I stood on the TED stage and warned about the dangers of superintelligence. I was wrong. It went even worse than I thought.

(Laughter)

I never thought governments would let AI companies get this far without any meaningful regulation. And the progress of AI went even faster than I predicted. Look, I showed this abstract landscape of tasks where the elevation represented how hard it was for AI to do each task at human level. And the sea level represented what AI could do back then. And boy, oh boy, has the sea been rising fast ever since. But a lot of these tasks have already gone blub blub blub blub blub blub. And the water is on track to submerge all land, matching human intelligence at all cognitive tasks.

This is a definition of artificial general intelligence, AGI, which is the stated goal of companies like OpenAI, Google DeepMind and Anthropic. And these companies are also trying to build superintelligence, leaving human intelligence far behind. And many think it'll only be a few years, maybe, from AGI to superintelligence.

So when are we going to get AGI? Well, until recently, most AI researchers thought it was at least decades away. And now Microsoft is saying, "Oh, it's almost here." We're seeing sparks of AGI in ChatGPT-4, and the Metaculus betting site is showing the time left to AGI plummeting from 20 years away to three years away in the last 18 months. And leading industry people are now predicting that we have maybe two or three years left until we get outsmarted. So you better stop talking about AGI as a long-term risk, or someone might call you a dinosaur stuck in the past.

It's really remarkable how AI has progressed recently. Not long ago, robots moved like this.

(Music)

Now they can dance.

(Music)

Just last year, Midjourney produced this image. This year, the exact same prompt produces this. Deepfakes are getting really convincing.

(Video) Deepfake Tom Cruise: I’m going to show you some magic.

It's the real thing.

(Laughs)

I mean ... It's all ... the real ... thing.

Max Tegmark: Or is it?

And Yoshua Bengio now argues that large language models have mastered language and knowledge to the point that they pass the Turing test. I know some skeptics are saying, "Nah, they're just overhyped stochastic parrots that lack a model of the world," but they clearly have a representation of the world. In fact, we recently found that Llama-2 even has a literal map of the world in it. And AI also builds geometric representations of more abstract concepts like what it thinks is true and false.

So what's going to happen if we get AGI and superintelligence? If you only remember one thing from my talk, let it be this. AI godfather Alan Turing predicted that the default outcome is the machines take control. The machines take control. I know this sounds like science fiction, but, you know, having AI as smart as GPT-4 also sounded like science fiction not long ago. And if you think of AI, if you think of superintelligence in particular, as just another technology, like electricity, you're probably not very worried. But you see, Turing thinks of superintelligence more like a new species. Think of it, we are building creepy, super capable, amoral psychopaths that don't sleep and think much faster than us, can make copies of themselves and have nothing human about them at all. So what could possibly go wrong?

(Laughter)

And it's not just Turing. OpenAI CEO Sam Altman, who gave us ChatGPT, recently warned that it could be "lights out for all of us." Anthropic CEO, Dario Amodei, even put a number on this risk: 10-25 percent. And it's not just them. Human extinction from AI went mainstream in May when all the AGI CEOs and who's who of AI researchers came on and warned about it. And last month, even the number one of the European Union warned about human extinction by AI.

So let me summarize everything I've said so far in just one slide of cat memes. Three years ago, people were saying, "It's inevitable, superintelligence, it'll be fine, it's decades away." Last year it was more like, "It's inevitable, it'll be fine." Now it's more like, "It's inevitable."

(Laughter)

But let's take a deep breath and try to raise our spirits and cheer ourselves up, because the rest of my talk is going to be about the good news, that it's not inevitable, and we can absolutely do better, alright?

(Applause)

So ... The real problem is that we lack a convincing plan for AI safety. People are working hard on evals looking for risky AI behavior, and that's good, but clearly not good enough. They're basically training AI to not say bad things rather than not do bad things. Moreover, evals and debugging are really just necessary, not sufficient, conditions for safety. In other words, they can prove the presence of risk, not the absence of risk. So let's up our game, alright? Try to see how we can make provably safe AI that we can control.

Guardrails try to physically limit harm. But if your adversary is superintelligence or a human using superintelligence against you, right, trying is just not enough. You need to succeed. Harm needs to be impossible. So we need provably safe systems. Provable, not in the weak sense of convincing some judge, but in the strong sense of there being something that's impossible according to the laws of physics. Because no matter how smart an AI is, it can't violate the laws of physics and do what's provably impossible. Steve Omohundro and I wrote a paper about this, and we're optimistic that this vision can really work. So let me tell you a little bit about how.

There's a venerable field called formal verification, which proves stuff about code. And I'm optimistic that AI will revolutionize the automatic proving business and also revolutionize program synthesis, the ability to automatically write really good code. So here is how our vision works. You, the human, write a specification that your AI tool must obey, for example, that it's impossible to log in to your laptop without the correct password, or that a DNA printer cannot synthesize dangerous viruses. Then a very powerful AI creates both your AI tool and a proof that your tool meets your spec. Machine learning is uniquely good at learning algorithms, but once the algorithm has been learned, you can re-implement it in a different computational architecture that's easier to verify.

Now you might worry, how on earth am I going to understand this powerful AI and the powerful AI tool it built and the proof, if they're all too complicated for any human to grasp? Here is the really great news. You don't have to understand any of that stuff, because it's much easier to verify a proof than to discover it. So you only have to understand or trust your proof-checking code, which could be just a few hundred lines long. And Steve and I envision that such proof checkers get built into all our compute hardware, so it just becomes impossible to run very unsafe code.

What if the AI, though, isn't able to write that AI tool for you? Then there's another possibility. You train an AI to first just learn to do what you want and then you use a different AI to extract out the learned algorithm and knowledge for you, like an AI neuroscientist. This is in the spirit of the field of mechanistic interpretability, which is making really impressive rapid progress. Provably safe systems are clearly not impossible.

Let's look at a simple example where we first machine-learn an algorithm from data and then distill it out in the form of code that provably meets spec, OK? Let's do it with an algorithm that you probably learned in first grade, addition, where you loop over the digits from right to left, and sometimes you do a carry. We'll do it in binary, as if you were counting on two fingers instead of ten. And we first train a recurrent neural network, never mind the details, to nail the task. So now you have this algorithm, and you don't understand how it works: it's a black box defined by a bunch of tables of numbers that we, in nerd speak, call parameters. Then we use an AI tool we built to automatically distill out from this the learned algorithm in the form of a Python program. And then we use the formal verification tool known as Dafny to prove that this program correctly adds up any numbers, not just the numbers that were in your training data.
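For readers who want to see what such a distilled program might look like, here is a minimal sketch. It is not the program shown in the talk; the function names (`add_binary`, `to_bits`, `from_bits`, `check_spec`) are assumptions made for illustration. Bits are stored least significant first, so looping over the list from index 0 corresponds to going from right to left over the written digits. The final check is exhaustive testing over a bounded range, which is weaker than the Dafny proof described above: Dafny would establish the property for all inputs, while this only covers numbers below `2**width`.

```python
from itertools import product


def add_binary(a_bits: list[int], b_bits: list[int]) -> list[int]:
    """Add two bit lists (least significant bit first) with a carry."""
    n = max(len(a_bits), len(b_bits))
    a = a_bits + [0] * (n - len(a_bits))  # pad the shorter input with zeros
    b = b_bits + [0] * (n - len(b_bits))
    out, carry = [], 0
    for x, y in zip(a, b):
        total = x + y + carry
        out.append(total % 2)   # sum bit for this position
        carry = total // 2      # carry into the next position
    out.append(carry)           # final carry becomes the top bit
    return out


def to_bits(n: int, width: int) -> list[int]:
    """Encode a non-negative integer as `width` bits, least significant first."""
    return [(n >> i) & 1 for i in range(width)]


def from_bits(bits: list[int]) -> int:
    """Decode a least-significant-first bit list back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def check_spec(width: int) -> bool:
    """Spec: decoding add_binary(a, b) gives a + b, for all a, b < 2**width."""
    return all(
        from_bits(add_binary(to_bits(a, width), to_bits(b, width))) == a + b
        for a, b in product(range(2 ** width), repeat=2)
    )


print(from_bits(add_binary(to_bits(5, 4), to_bits(6, 4))))  # 11
print(check_spec(6))  # True for every pair of 6-bit numbers
```

The point of the spec in `check_spec` is that it says what the program must do without saying how; a verifier then has to connect the loop-and-carry implementation to that statement.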

So in summary, provably safe AI, I'm convinced, is possible, but it's going to take time and work. And in the meantime, let's remember that all the AI benefits that most people are excited about actually don't require superintelligence. We can have a long and amazing future with AI.

So let's not pause AI. Let's just pause the reckless race to superintelligence. Let's stop obsessively training ever-larger models that we don't understand. Let's heed the warning from ancient Greece and not get hubris, like in the story of Icarus. Because artificial intelligence is giving us incredible intellectual wings with which we can do things beyond our wildest dreams if we stop obsessively trying to fly to the sun.

Thank you.

(Applause)
