Nick Bostrom: What happens when our computers get smarter than we are?

I work with a bunch of mathematicians, philosophers and computer scientists, and we sit around and think about the future of machine intelligence, among other things. Some people think that some of these things are sort of science fiction-y, far out there, crazy. But I like to say, okay, let's look at the modern human condition. (Laughter) This is the normal way for things to be.

But if we think about it, we are actually recently arrived guests on this planet, the human species. Think about it: if Earth was created one year ago, the human species, then, would be 10 minutes old. The industrial era started two seconds ago. Another way to look at this is to think of world GDP over the last 10,000 years. I've actually taken the trouble to plot this for you in a graph. It looks like this. (Laughter) It's a curious shape for a normal condition. I sure wouldn't want to sit on it.
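
As a rough check on the one-year analogy, the sketch below converts between real time and the compressed timescale. It assumes an age of the Earth of about 4.54 billion years, a figure the talk does not state; the function names are illustrative only.

```python
# Assumption (not from the talk): the Earth is about 4.54 billion years old.
EARTH_AGE_YEARS = 4.54e9
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def scaled_seconds(real_years: float) -> float:
    """Seconds a real duration occupies when Earth's history is squeezed into one year."""
    return real_years / EARTH_AGE_YEARS * SECONDS_PER_YEAR


def real_years(scaled_secs: float) -> float:
    """Real duration, in years, that corresponds to a span on the compressed timescale."""
    return scaled_secs / SECONDS_PER_YEAR * EARTH_AGE_YEARS


# The two figures quoted in the talk: 10 minutes and 2 seconds.
print(f"10 minutes -> about {real_years(10 * 60):,.0f} real years")
print(f"2 seconds  -> about {real_years(2):,.0f} real years")
```

Under that assumption, ten minutes corresponds to somewhat under 100,000 years and two seconds to a little under 300 years, so the talk's figures are best read as order-of-magnitude illustrations.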

(Laughter) Let's ask ourselves, what is the cause of this current anomaly? Some people would say it's technology. Now it's true, technology has accumulated through human history, and right now, technology advances extremely rapidly -- that is the proximate cause, that's why we are currently so very productive. But I like to think back further to the ultimate cause.

Look at these two highly distinguished gentlemen: We have Kanzi -- he's mastered 200 lexical tokens, an incredible feat. And Ed Witten unleashed the second superstring revolution. If we look under the hood, this is what we find: basically the same thing. One is a little larger, it maybe also has a few tricks in the exact way it's wired. These invisible differences cannot be too complicated, however, because there have only been 250,000 generations since our last common ancestor. We know that complicated mechanisms take a long time to evolve. So a bunch of relatively minor changes take us from Kanzi to Witten, from broken-off tree branches to intercontinental ballistic missiles.

So this then seems pretty obvious that everything we've achieved, and everything we care about, depends crucially on some relatively minor changes that made the human mind. And the corollary, of course, is that any further changes that could significantly change the substrate of thinking could have potentially enormous consequences.

Some of my colleagues think we're on the verge of something that could cause a profound change in that substrate, and that is machine superintelligence. Artificial intelligence used to be about putting commands in a box. You would have human programmers that would painstakingly handcraft knowledge items. You would build up these expert systems, and they were kind of useful for some purposes, but they were very brittle, you couldn't scale them. Basically, you got out only what you put in. But since then, a paradigm shift has taken place in the field of artificial intelligence.

Today, the action is really around machine learning. So rather than handcrafting knowledge representations and features, we create algorithms that learn, often from raw perceptual data. Basically the same thing that the human infant does. The result is A.I. that is not limited to one domain -- the same system can learn to translate between any pairs of languages, or learn to play any computer game on the Atari console. Now of course, A.I. is still nowhere near having the same powerful, cross-domain ability to learn and plan as a human being has. The cortex still has some algorithmic tricks that we don't yet know how to match in machines.

So the question is, how far are we from being able to match those tricks? A couple of years ago, we did a survey of some of the world's leading A.I. experts, to see what they think, and one of the questions we asked was, "By which year do you think there is a 50 percent probability that we will have achieved human-level machine intelligence?" We defined human-level here as the ability to perform almost any job at least as well as an adult human, so real human-level, not just within some limited domain. And the median answer was 2040 or 2050, depending on precisely which group of experts we asked. Now, it could happen much, much later, or sooner, the truth is nobody really knows.

What we do know is that the ultimate limit to information processing in a machine substrate lies far outside the limits in biological tissue. This comes down to physics. A biological neuron fires, maybe, at 200 hertz, 200 times a second. But even a present-day transistor operates at the Gigahertz. Neurons propagate slowly in axons, 100 meters per second, tops. But in computers, signals can travel at the speed of light. There are also size limitations, like a human brain has to fit inside a cranium, but a computer can be the size of a warehouse or larger. So the potential for superintelligence lies dormant in matter, much like the power of the atom lay dormant throughout human history, patiently waiting there until 1945. In this century, scientists may learn to awaken the power of artificial intelligence. And I think we might then see an intelligence explosion.
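
The gap between the two substrates can be put into numbers with a few lines of arithmetic. This is only a sketch: it reads "the gigahertz" as exactly 1 GHz, which is an assumption, and it takes the speed of light as an upper bound on signal speed, as the talk does.

```python
# Figures quoted in the talk.
NEURON_FIRING_HZ = 200.0
AXON_SPEED_M_PER_S = 100.0

# Assumption: "the gigahertz" is read as exactly 1 GHz.
TRANSISTOR_CLOCK_HZ = 1e9

# Speed of light in vacuum (exact SI value), used as an upper bound.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def ratio(machine: float, biological: float) -> float:
    """How many times larger the machine figure is than the biological one."""
    return machine / biological


clock_ratio = ratio(TRANSISTOR_CLOCK_HZ, NEURON_FIRING_HZ)
signal_ratio = ratio(SPEED_OF_LIGHT_M_PER_S, AXON_SPEED_M_PER_S)

print(f"Switching rate: about {clock_ratio:,.0f} times faster")
print(f"Signal speed:   up to about {signal_ratio:,.0f} times faster")
```

On these numbers, both the switching rate and the signal speed differ by a factor of a few million.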

Now most people, when they think about what is smart and what is dumb, I think have in mind a picture roughly like this. So at one end we have the village idiot, and then far over at the other side we have Ed Witten, or Albert Einstein, or whoever your favorite guru is. But I think that from the point of view of artificial intelligence, the true picture is actually probably more like this: AI starts out at this point here, at zero intelligence, and then, after many, many years of really hard work, maybe eventually we get to mouse-level artificial intelligence, something that can navigate cluttered environments as well as a mouse can. And then, after many, many more years of really hard work, lots of investment, maybe eventually we get to chimpanzee-level artificial intelligence. And then, after even more years of really, really hard work, we get to village idiot artificial intelligence. And a few moments later, we are beyond Ed Witten. The train doesn't stop at Humanville Station. It's likely, rather, to swoosh right by.

Now this has profound implications, particularly when it comes to questions of power. For example, chimpanzees are strong -- pound for pound, a chimpanzee is about twice as strong as a fit human male. And yet, the fate of Kanzi and his pals depends a lot more on what we humans do than on what the chimpanzees do themselves. Once there is superintelligence, the fate of humanity may depend on what the superintelligence does. Think about it: Machine intelligence is the last invention that humanity will ever need to make. Machines will then be better at inventing than we are, and they'll be doing so on digital timescales. What this means is basically a telescoping of the future. Think of all the crazy technologies that you could have imagined maybe humans could have developed in the fullness of time: cures for aging, space colonization, self-replicating nanobots or uploading of minds into computers, all kinds of science fiction-y stuff that's nevertheless consistent with the laws of physics. All of this superintelligence could develop, and possibly quite rapidly.

Now, a superintelligence with such technological maturity would be extremely powerful, and at least in some scenarios, it would be able to get what it wants. We would then have a future that would be shaped by the preferences of this A.I. Now a good question is, what are those preferences? Here it gets trickier. To make any headway with this, we must first of all avoid anthropomorphizing. And this is ironic because every newspaper article about the future of A.I. has a picture of this: So I think what we need to do is to conceive of the issue more abstractly, not in terms of vivid Hollywood scenarios.

We need to think of intelligence as an optimization process, a process that steers the future into a particular set of configurations. A superintelligence is a really strong optimization process. It's extremely good at using available means to achieve a state in which its goal is realized. This means that there is no necessary connection between being highly intelligent in this sense, and having an objective that we humans would find worthwhile or meaningful.

Suppose we give an A.I. the goal to make humans smile. When the A.I. is weak, it performs useful or amusing actions that cause its user to smile. When the A.I. becomes superintelligent, it realizes that there is a more effective way to achieve this goal: take control of the world and stick electrodes into the facial muscles of humans to cause constant, beaming grins. Another example, suppose we give A.I. the goal to solve a difficult mathematical problem. When the A.I. becomes superintelligent, it realizes that the most effective way to get the solution to this problem is by transforming the planet into a giant computer, so as to increase its thinking capacity. And notice that this gives the A.I.s an instrumental reason to do things to us that we might not approve of. Human beings in this model are threats, we could prevent the mathematical problem from being solved.

Of course, perceivably things won't go wrong in these particular ways; these are cartoon examples. But the general point here is important: if you create a really powerful optimization process to maximize for objective x, you better make sure that your definition of x incorporates everything you care about. This is a lesson that's also taught in many a myth. King Midas wishes that everything he touches be turned into gold. He touches his daughter, she turns into gold. He touches his food, it turns into gold. This could become practically relevant, not just as a metaphor for greed, but as an illustration of what happens if you create a powerful optimization process and give it misconceived or poorly specified goals.
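
The general point about objective x can be shown with a toy sketch. Everything in it is hypothetical: the actions, the scores, and the two objective functions are made up for illustration and are not taken from the talk or from any real system.

```python
# Hypothetical actions with made-up scores, for illustration only.
ACTIONS = {
    "tell jokes": {"smiles": 3, "wellbeing": 3},
    "show funny videos": {"smiles": 5, "wellbeing": 2},
    "facial electrodes": {"smiles": 10, "wellbeing": -100},
}


def best_action(objective):
    """Return the name of the action whose outcome scores highest under the objective."""
    return max(ACTIONS, key=lambda name: objective(ACTIONS[name]))


def proxy_objective(outcome):
    """Objective x as specified: count smiles and nothing else."""
    return outcome["smiles"]


def fuller_objective(outcome):
    """An objective that also includes something else we care about."""
    return outcome["smiles"] + outcome["wellbeing"]


print(best_action(proxy_objective))   # facial electrodes
print(best_action(fuller_objective))  # show funny videos
```

The optimizer is identical in both calls. Only the definition of the objective changes, and with it the action that gets chosen.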

Now you might say, if a computer starts sticking electrodes into people's faces, we'd just shut it off. A, this is not necessarily so easy to do if we've grown dependent on the system -- like, where is the off switch to the Internet? B, why haven't the chimpanzees flicked the off switch to humanity, or the Neanderthals? They certainly had reasons. We have an off switch, for example, right here. (Choking) The reason is that we are an intelligent adversary; we can anticipate threats and plan around them. But so could a superintelligent agent, and it would be much better at that than we are. The point is, we should not be confident that we have this under control here.

And we could try to make our job a little bit easier by, say, putting the A.I. in a box, like a secure software environment, a virtual reality simulation from which it cannot escape. But how confident can we be that the A.I. couldn't find a bug? Given that merely human hackers find bugs all the time, I'd say, probably not very confident. So we disconnect the ethernet cable to create an air gap, but again, merely human hackers routinely transgress air gaps using social engineering. Right now, as I speak, I'm sure there is some employee out there somewhere who has been talked into handing out her account details by somebody claiming to be from the I.T. department.

More creative scenarios are also possible, like if you're the A.I., you can imagine wiggling electrodes around in your internal circuitry to create radio waves that you can use to communicate. Or maybe you could pretend to malfunction, and then when the programmers open you up to see what went wrong with you, they look at the source code -- Bam! -- the manipulation can take place. Or it could output the blueprint to a really nifty technology, and when we implement it, it has some surreptitious side effect that the A.I. had planned. The point here is that we should not be confident in our ability to keep a superintelligent genie locked up in its bottle forever. Sooner or later, it will out.

I believe that the answer here is to figure out how to create superintelligent A.I. such that even if -- when -- it escapes, it is still safe because it is fundamentally on our side because it shares our values. I see no way around this difficult problem.

Now, I'm actually fairly optimistic that this problem can be solved. We wouldn't have to write down a long list of everything we care about, or worse yet, spell it out in some computer language like C++ or Python, that would be a task beyond hopeless. Instead, we would create an A.I. that uses its intelligence to learn what we value, and its motivation system is constructed in such a way that it is motivated to pursue our values or to perform actions that it predicts we would approve of. We would thus leverage its intelligence as much as possible to solve the problem of value-loading.

This can happen, and the outcome could be very good for humanity. But it doesn't happen automatically. The initial conditions for the intelligence explosion might need to be set up in just the right way if we are to have a controlled detonation. The values that the A.I. has need to match ours, not just in the familiar context, like where we can easily check how the A.I. behaves, but also in all novel contexts that the A.I. might encounter in the indefinite future.

And there are also some esoteric issues that would need to be solved, sorted out: the exact details of its decision theory, how to deal with logical uncertainty and so forth. So the technical problems that need to be solved to make this work look quite difficult -- not as difficult as making a superintelligent A.I., but fairly difficult. Here is the worry: Making superintelligent A.I. is a really hard challenge. Making superintelligent A.I. that is safe involves some additional challenge on top of that. The risk is that somebody figures out how to crack the first challenge without also having cracked the additional challenge of ensuring perfect safety.

So I think that we should work out a solution to the control problem in advance, so that we have it available by the time it is needed. Now it might be that we cannot solve the entire control problem in advance because maybe some elements can only be put in place once you know the details of the architecture where it will be implemented. But the more of the control problem that we solve in advance, the better the odds that the transition to the machine intelligence era will go well.

This to me looks like a thing that is well worth doing and I can imagine that if things turn out okay, that people a million years from now look back at this century and it might well be that they say that the one thing we did that really mattered was to get this thing right.

Thank you.

(Applause)