Max Tegmark: How to keep AI under control

Five years ago, I stood on the TED stage and warned about the dangers of superintelligence. I was wrong. It went even worse than I thought.

(Laughter)

I never thought governments would let AI companies get this far without any meaningful regulation. And the progress of AI went even faster than I predicted. Look, I showed this abstract landscape of tasks where the elevation represented how hard it was for AI to do each task at human level. And the sea level represented what AI could do back then. And boy oh boy, has the sea been rising fast ever since. But a lot of these tasks have already gone blub blub blub blub blub blub. And the water is on track to submerge all land, matching human intelligence at all cognitive tasks.

This is a definition of artificial general intelligence, AGI, which is the stated goal of companies like OpenAI, Google DeepMind and Anthropic. And these companies are also trying to build superintelligence, leaving human intelligence far behind. And many think it'll only be a few years, maybe, from AGI to superintelligence.

So when are we going to get AGI? Well, until recently, most AI researchers thought it was at least decades away. And now Microsoft is saying, "Oh, it's almost here. We're seeing sparks of AGI in GPT-4." And the Metaculus betting site is showing the time left to AGI plummeting from 20 years away to three years away in the last 18 months. And leading industry people are now predicting that we have maybe two or three years left until we get outsmarted. So you'd better stop talking about AGI as a long-term risk, or someone might call you a dinosaur stuck in the past.

It's really remarkable how AI has progressed recently. Not long ago, robots moved like this.

(Music)

Now they can dance.

(Music)

Just last year, Midjourney produced this image. This year, the exact same prompt produces this. Deepfakes are getting really convincing.

(Video) Deepfake Tom Cruise: I’m going to show you some magic.

It's the real thing.

(Laughs)

I mean ... It's all ... the real ... thing.

Max Tegmark: Or is it?

And Yoshua Bengio now argues that large language models have mastered language and knowledge to the point that they pass the Turing test. I know some skeptics are saying, "Nah, they're just overhyped stochastic parrots that lack a model of the world," but they clearly have a representation of the world. In fact, we recently found that Llama-2 even has a literal map of the world in it. And AI also builds geometric representations of more abstract concepts like what it thinks is true and false.

So what's going to happen if we get AGI and superintelligence? If you only remember one thing from my talk, let it be this. AI godfather Alan Turing predicted that the default outcome is the machines take control. The machines take control. I know this sounds like science fiction, but, you know, having AI as smart as GPT-4 also sounded like science fiction not long ago. And if you think of AI, if you think of superintelligence in particular, as just another technology, like electricity, you're probably not very worried. But you see, Turing thinks of superintelligence more like a new species. Think of it, we are building creepy, super capable, amoral psychopaths that don't sleep and think much faster than us, can make copies of themselves and have nothing human about them at all. So what could possibly go wrong?

(Laughter)

And it's not just Turing. OpenAI CEO Sam Altman, who gave us ChatGPT, recently warned that it could be "lights out for all of us." Anthropic CEO Dario Amodei even put a number on this risk: 10-25 percent. And it's not just them. Human extinction from AI went mainstream in May when all the AGI CEOs and who's who of AI researchers came out and warned about it. And last month, even the number one of the European Union warned about human extinction by AI.

So let me summarize everything I've said so far in just one slide of cat memes. Three years ago, people were saying, "It's inevitable, superintelligence, it'll be fine, it's decades away." Last year it was more like, "It's inevitable, it'll be fine." Now it's more like, "It's inevitable."

(Laughter)

But let's take a deep breath and try to raise our spirits and cheer ourselves up, because the rest of my talk is going to be about the good news, that it's not inevitable, and we can absolutely do better, alright?

(Applause)

So ... The real problem is that we lack a convincing plan for AI safety. People are working hard on evals looking for risky AI behavior, and that's good, but clearly not good enough. They're basically training AI to not say bad things rather than not do bad things. Moreover, evals and debugging are really just necessary, not sufficient, conditions for safety. In other words, they can prove the presence of risk, not the absence of risk. So let's up our game, alright? Try to see how we can make provably safe AI that we can control.
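
The claim that evals can show the presence of risk but not its absence can be illustrated with a small sketch. The Python below was added for illustration and is not from the talk: `buggy_add`, its planted defect and the `passes_eval` helper are hypothetical names invented for this example.

```python
def buggy_add(x, y):
    """Hypothetical adder with one deliberately planted defect."""
    if (x, y) == (40961, 7):
        return x + y + 1  # wrong on exactly one input pair
    return x + y


def passes_eval(func, cases):
    """Return True if func matches ordinary addition on every listed case."""
    return all(func(x, y) == x + y for x, y in cases)


# An "eval" made of 10,000 test cases that all happen to miss the defect.
eval_cases = [(x, y) for x in range(100) for y in range(100)]

print(passes_eval(buggy_add, eval_cases))      # the eval finds nothing wrong
print(buggy_add(40961, 7) == 40961 + 7)        # yet the defect is still there
```

Passing the eval only says that no failure was found among the cases tried. A single failing case would prove a problem exists, but no number of passing cases proves there is none.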

Guardrails try to physically limit harm.

But if your adversary is superintelligence or a human using superintelligence against you, right, trying is just not enough. You need to succeed. Harm needs to be impossible. So we need provably safe systems. Provable, not in the weak sense of convincing some judge, but in the strong sense of there being something that's impossible according to the laws of physics. Because no matter how smart an AI is, it can't violate the laws of physics and do what's provably impossible. Steve Omohundro and I wrote a paper about this, and we're optimistic that this vision can really work. So let me tell you a little bit about how.

There's a venerable field called formal verification, which proves stuff about code. And I'm optimistic that AI will revolutionize the automatic proving business and also revolutionize program synthesis, the ability to automatically write really good code. So here is how our vision works. You, the human, write a specification that your AI tool must obey, that it's impossible to log in to your laptop without the correct password, or that a DNA printer cannot synthesize dangerous viruses. Then a very powerful AI creates both your AI tool and a proof that your tool meets your spec. Machine learning is uniquely good at learning algorithms, but once the algorithm has been learned, you can re-implement it in a different computational architecture that's easier to verify.

Now you might worry, how on earth am I going to understand this powerful AI and the powerful AI tool it built and the proof, if they're all too complicated for any human to grasp? Here is the really great news. You don't have to understand any of that stuff, because it's much easier to verify a proof than to discover it. So you only have to understand or trust your proof-checking code, which could be just a few hundred lines long. And Steve and I envision that such proof checkers get built into all our compute hardware, so it just becomes impossible to run very unsafe code.
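
The asymmetry between checking and discovering can be sketched with an analogy. The Python below is not a proof checker and is not from the talk. It uses Boolean satisfiability as a stand-in, where a claimed solution plays the role of the proof. The function names `check` and `search` and the example formula are assumptions made for illustration.

```python
from itertools import product


def check(formula, assignment):
    """Verify a claimed solution with a single pass over the clauses.

    Each clause is a list of non-zero ints: k means "variable k is true",
    -k means "variable k is false". A clause holds if any literal holds.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )


def search(formula, num_vars):
    """Discover a solution by brute force over up to 2**num_vars candidates."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: bit for i, bit in enumerate(bits)}
        if check(formula, assignment):
            return assignment
    return None


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
formula = [[1, 2], [-1, 3], [-2, -3]]
solution = search(formula, 3)
print(solution, check(formula, solution))
```

Here `check` is a few lines long and does work proportional to the size of the formula, while `search` may try exponentially many candidates. Whoever produces the solution can be arbitrarily clever or complicated. Only the small checker has to be understood and trusted.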

What if the AI, though, isn't able to write that AI tool for you? Then there's another possibility. You train an AI to first just learn to do what you want and then you use a different AI to extract out the learned algorithm and knowledge for you, like an AI neuroscientist. This is in the spirit of the field of mechanistic interpretability, which is making really impressive rapid progress. Provably safe systems are clearly not impossible.

Let's look at a simple example of where we first machine-learn an algorithm from data and then distill it out in the form of code that provably meets spec, OK? Let’s do it with an algorithm that you probably learned in first grade, addition, where you loop over the digits from right to left, and sometimes you do a carry. We'll do it in binary, as if you were counting on two fingers instead of ten. And we first train a recurrent neural network, never mind the details, to nail the task. So now you have this algorithm, and you don't understand how it works, in a black box defined by a bunch of tables of numbers that we, in nerd speak, call parameters. Then we use an AI tool we built to automatically distill out from this the learned algorithm in the form of a Python program. And then we use the formal verification tool known as Dafny to prove that this program correctly adds up any numbers, not just the numbers that were in your training data.
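
The addition algorithm described here, looping over the digits from right to left with an occasional carry, can be sketched in Python as follows. This is a hand-written illustration, not the program distilled in the talk, and the names `add_binary`, `to_bits`, `from_bits` and `spec_holds` are assumptions. Bits are stored least significant first, so index 0 is the rightmost digit.

```python
def add_binary(a, b):
    """Add two bit lists (least significant bit first) with a ripple carry."""
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        bit_a = a[i] if i < len(a) else 0
        bit_b = b[i] if i < len(b) else 0
        total = bit_a + bit_b + carry
        result.append(total % 2)   # digit written in this column
        carry = total // 2         # carry passed to the next column
    if carry:
        result.append(carry)
    return result


def to_bits(n):
    """Convert a non-negative int to a bit list, least significant first."""
    bits = []
    while n:
        bits.append(n % 2)
        n //= 2
    return bits


def from_bits(bits):
    """Convert a bit list (least significant first) back to an int."""
    return sum(bit << i for i, bit in enumerate(bits))


def spec_holds(limit):
    """Check the spec "output equals x + y" for all x, y below limit."""
    return all(
        from_bits(add_binary(to_bits(x), to_bits(y))) == x + y
        for x in range(limit)
        for y in range(limit)
    )


print(add_binary([1, 1], [1]))  # 3 + 1 in binary, least significant bit first
print(spec_holds(64))
```

Note that `spec_holds` only tests a bounded range of inputs, so it is still testing rather than proof. The step described in the talk is stronger: a verifier such as Dafny establishes the property for every possible input.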

So in summary, provably safe AI, I'm convinced is possible, but it's going to take time and work. And in the meantime, let's remember that all the AI benefits that most people are excited about actually don't require superintelligence. We can have a long and amazing future with AI.

So let's not pause AI. Let's just pause the reckless race to superintelligence. Let's stop obsessively training ever-larger models that we don't understand. Let's heed the warning from ancient Greece and not get hubris, like in the story of Icarus. Because artificial intelligence is giving us incredible intellectual wings with which we can do things beyond our wildest dreams if we stop obsessively trying to fly to the sun.

Thank you.

(Applause)
