Stuart Russell: 3 principles for creating safer AI

This is Lee Sedol. Lee Sedol is one of the world's greatest Go players, and he's having what my friends in Silicon Valley call a "Holy Cow" moment --

Burada gördüğünüz Lee Sedol. Lee Sedol dünyadaki en iyi Go oyuncularından biri. Burada Silikon Vadisi'ndeki arkadaşlarımın deyişiyle bir 'Aman Tanrım' anı yaşıyor

(Laughter)

(Gülüşmeler)

a moment where we realize that AI is actually progressing a lot faster than we expected. So humans have lost on the Go board. What about the real world?

Bu, yapay zekanın beklediğimizden çok daha hızlı ilerlediğini anladığımız bir an. İnsanlar Go tahtasında kaybetti. Peki ya gerçek hayatta?

Well, the real world is much bigger, much more complicated than the Go board. It's a lot less visible, but it's still a decision problem. And if we think about some of the technologies that are coming down the pike ... Noriko [Arai] mentioned that reading is not yet happening in machines, at least with understanding. But that will happen, and when that happens, very soon afterwards, machines will have read everything that the human race has ever written. And that will enable machines, along with the ability to look further ahead than humans can, as we've already seen in Go, if they also have access to more information, they'll be able to make better decisions in the real world than we can. So is that a good thing? Well, I hope so.

Gerçek dünya Go tahtasından çok daha büyük ve çok daha karmaşık. Daha az şeffaf ama yine de bir karar sorunu oluşturuyor. Eğer gelmekte olan bazı teknolojileri düşünecek olursak... Noriko [Arai] makinelerin henüz okuyamadığından bahsetti; en azından okuduklarını anlamadıklarından. Ama bu gerçekleşecek ve bunun gerçekleşmesinin hemen ardından makineler, insan türünün tarih boyunca yazdığı her şeyi okumuş olacak. Bu da makinelere insanlardan daha öteye bakma yetisini verecek. Go oyununda gördüğümüz gibi eğer daha fazla bilgiye erişebilirlerse, gerçek hayatta bizden daha iyi kararlar alabilecekler. Peki bu iyi bir şey mi? Umarım öyledir.

Our entire civilization, everything that we value, is based on our intelligence. And if we had access to a lot more intelligence, then there's really no limit to what the human race can do. And I think this could be, as some people have described it, the biggest event in human history. So why are people saying things like this, that AI might spell the end of the human race? Is this a new thing? Is it just Elon Musk and Bill Gates and Stephen Hawking?

Tüm uygarlığımız, değer verdiğimiz her şey zekamıza dayanıyor. Eğer bizlerin daha fazla zekaya erişebilme şansı olsa insan türünün yapabileceği şeylerin bir sınırı olmayacak. Sanırım bu, bazılarının tanımladığı gibi dünya tarihindeki en önemli olay olacaktır. Peki neden bazıları yapay zekanın insan türünün sonunu getirebileceğini söylüyor? Bu yeni bir endişe mi? Yani bunu düşünen sadece Elon Musk, Bill Gates ve Stephen Hawking mi?

Actually, no. This idea has been around for a while. Here's a quotation: "Even if we could keep the machines in a subservient position, for instance, by turning off the power at strategic moments" -- and I'll come back to that "turning off the power" idea later on -- "we should, as a species, feel greatly humbled." So who said this? This is Alan Turing in 1951. Alan Turing, as you know, is the father of computer science and in many ways, the father of AI as well. So if we think about this problem, the problem of creating something more intelligent than your own species, we might call this "the gorilla problem," because gorillas' ancestors did this a few million years ago, and now we can ask the gorillas: Was this a good idea?

Aslında hayır. Bu fikir uzun zamandır var. İşte size bir alıntı: ''Makineleri itaatkâr bir pozisyonda tutabiliyor olsak bile -örneğin güç ünitelerini stratejik anlarda kapatarak- (birazdan bu ''güç ünitesini kapama'' fikrine geri döneceğim) insan türü olarak bizler çok aşağılanmış gibi hissedeceğiz." Peki bunu kim söyledi? 1951 yılında Alan Turing. Alan Turing bildiğiniz gibi bilgisayar biliminin ve ayrıca pek çok yönden yapay zekanın da babasıdır. Eğer bu problemi kendi türümüzden daha zeki bir tür yaratma problemi olarak düşünürsek, buna ''goril problemi'' de diyebiliriz. Çünkü gorillerin ataları bunu birkaç milyon yıl önce yaptı: Gorillere bugün şu soruyu sorabiliriz: Bu iyi bir fikir miydi?

So here they are having a meeting to discuss whether it was a good idea, and after a little while, they conclude, no, this was a terrible idea. Our species is in dire straits. In fact, you can see the existential sadness in their eyes.

Burada toplantıdalar ve bu fikri değerlendiriyorlar. Sonunda da bunun kötü bir fikir olduğuna karar veriyorlar. Türümüz zor durumda. Aslında gözlerindeki varoluşsal üzüntüyü görebiliyorsunuz.

(Laughter)

(Gülüşmeler)

So this queasy feeling that making something smarter than your own species is maybe not a good idea -- what can we do about that? Well, really nothing, except stop doing AI, and because of all the benefits that I mentioned and because I'm an AI researcher, I'm not having that. I actually want to be able to keep doing AI.

Kendi türünüzden daha zeki bir şey yaratmanın pek de iyi bir fikir olmayabileceğine dair tatsız bir his... Bu konuda ne yapabiliriz? Doğrusu, yapay zeka ile uğraşmayı bırakmaktan başka bir şey yapamayız. Ama az önce sözünü ettiğim tüm yararlarından dolayı ve ben de bir YZ araştırmacısı olduğumdan, buna katılmıyorum. Aslında ben YZ yapmaya çalışmayı sürdürmek istiyorum.

So we actually need to nail down the problem a bit more. What exactly is the problem? Why is better AI possibly a catastrophe?

Yani aslında soruna biraz daha yakından bakmalıyız. Sorun tam olarak nedir? Neden daha iyi bir YZ muhtemelen bir felaket olsun?

So here's another quotation: "We had better be quite sure that the purpose put into the machine is the purpose which we really desire." This was said by Norbert Wiener in 1960, shortly after he watched one of the very early learning systems learn to play checkers better than its creator. But this could equally have been said by King Midas. King Midas said, "I want everything I touch to turn to gold," and he got exactly what he asked for. That was the purpose that he put into the machine, so to speak, and then his food and his drink and his relatives turned to gold and he died in misery and starvation. So we'll call this "the King Midas problem" of stating an objective which is not, in fact, truly aligned with what we want. In modern terms, we call this "the value alignment problem."

İşte bir başka alıntı: ''Makinelere verdiğimiz amaçların aslında bizlerin arzuladığı amaçlar olduğuna emin olmalıyız.'' Norbert Wiener bunu 1960 yılında, ilk öğrenim sistemlerinden birinin, dama oynamayı, yaratıcısından daha iyi başardığını izledikten sonra söylemişti. Ama bu pekâlâ Kral Midas tarafından da söylenmiş olabilirdi. Kral Midas şöyle demişti: ''Dokunduğum her şeyin altın olmasını istiyorum,'' ve istediğini tam olarak elde etti. Onun makineye verdiği amacın bu olduğu söylenebilir. Bunun üstüne, yiyeceği ve içeceği ve akrabaları altına dönüştü. Kendisi de açlık ve sefalet içinde öldü. O zaman, belirttiğimiz ereklerin, aslında istediğimiz şey ile gerçekte uyuşmaması durumuna ''Kral Midas Problemi'' diyelim. Modern terimlerle buna ''değer uyuşmazlığı problemi'' diyoruz.

Putting in the wrong objective is not the only part of the problem. There's another part. If you put an objective into a machine, even something as simple as, "Fetch the coffee," the machine says to itself, "Well, how might I fail to fetch the coffee? Someone might switch me off. OK, I have to take steps to prevent that. I will disable my 'off' switch. I will do anything to defend myself against interference with this objective that I have been given." So this single-minded pursuit in a very defensive mode of an objective that is, in fact, not aligned with the true objectives of the human race -- that's the problem that we face. And in fact, that's the high-value takeaway from this talk. If you want to remember one thing, it's that you can't fetch the coffee if you're dead.

Problem, yanlış amaç yerleştirmekten ibaret değil. Bir yönü daha var. Bir makineye bir amaç verirseniz, örneğin ''kahvemi getir'' gibi basit bir hedef olsa bile makine kendi kendine şöyle der: ''Kahveyi getirmekte nasıl başarısızlığa uğrayabilirim? Biri beni kapatabilir. Peki. Bunu önlemek için bir şeyler yapmalıyım. 'Kapama' düğmemi işlev dışı bırakacağım. Bana verilen komutu yerine getirmemi engelleyebilecek her türlü müdahaleye karşı kendimi savunmak için ne gerekiyorsa yapacağım." Sonuç olarak, gayet savunmacı bir moddaki bu tek-hedef odaklı hareket, insan türünün gerçek hedefleriyle örtüşmüyor. Karşı karşıya kaldığımız sorun aslında bu. Doğrusu, tam da bu, bu konuşmanın en değerli dersi. Hatırlamanız gereken bir şey varsa, o da ölüyken kahve getiremeyeceğiniz olmalı.

(Laughter)

(Gülüşmeler)

It's very simple. Just remember that. Repeat it to yourself three times a day.

Bu çok basit. Sadece bunu hatırlayın. Kendinize bunu günde üç kez tekrar edin.

(Laughter)

(Gülüşmeler)

And in fact, this is exactly the plot of "2001: [A Space Odyssey]." HAL has an objective, a mission, which is not aligned with the objectives of the humans, and that leads to this conflict. Now fortunately, HAL is not superintelligent. He's pretty smart, but eventually Dave outwits him and manages to switch him off. But we might not be so lucky. So what are we going to do?

Aslında, bu tam da ''2001: [Bir Uzay Destanı]'' filmindeki hikayedir. HAL'in insanların hedefleriyle örtüşmeyen bir hedefi, bir misyonu vardır. Bu da bir çatışmaya neden olur. Neyse ki, HAL süper-zekalı değildir. Çok zeki olsa da, sonunda Dave onu atlatır ve onu kapatmayı başarır. Ama biz o kadar şanslı olamayabiliriz. Peki ne yapacağız o zaman?

I'm trying to redefine AI to get away from this classical notion of machines that intelligently pursue objectives. There are three principles involved. The first one is a principle of altruism, if you like, that the robot's only objective is to maximize the realization of human objectives, of human values. And by values here I don't mean touchy-feely, goody-goody values. I just mean whatever it is that the human would prefer their life to be like. And so this actually violates Asimov's law that the robot has to protect its own existence. It has no interest in preserving its existence whatsoever.

Ben komutları zekice izleyen makinelerle ilgili bu klasik düşünceden kurtulmamız için yapay zekayı yeniden tanımlamaya çalışıyorum. Bununla ilgili üç ilke var. Birincisi, çıkar gözetmemezlik de denilebilecek bir ilke. Yani robotların tek hedeflerinin insanların hedeflerine, insanların değerlerine hitap etmek olması. 'Değerler' derken, samimiyete dayanan değerleri kastetmiyorum. Sadece, insanların hayatlarının nasıl olması gerektiği hakkındaki tercihlerini kastediyorum. Bu, aslında Asimov'un "robotların kendi varlıklarını korumaları gerektiği" kuralını çiğniyor. Robotlar, kendi varlıklarını korumakla ilgilenmez.

The second law is a law of humility, if you like. And this turns out to be really important to make robots safe. It says that the robot does not know what those human values are, so it has to maximize them, but it doesn't know what they are. And that avoids this problem of single-minded pursuit of an objective. This uncertainty turns out to be crucial.

İkinci kuralın mütevazilikle ilgili olduğu söylenebilir. Bunun, bir robotun güvenilir olmasında büyük katkısı olduğu ortaya çıktı. Bu kural, robotların insanların değerlerinin ne olduğunu bilmeden onları arttırmaya çalıştığını söylüyor. Yani bilmediği değerleri arttırmaya çalışıyor. Bu da tek-hedefli uğraş problemini önler. Bu belirsizliğin çok önemli olduğu ortaya çıktı.

Now, in order to be useful to us, it has to have some idea of what we want. It obtains that information primarily by observation of human choices, so our own choices reveal information about what it is that we prefer our lives to be like. So those are the three principles. Let's see how that applies to this question of: "Can you switch the machine off?" as Turing suggested.

Bunun bize yararlı olabilmesi için robotların bizlerin ne isteyebileceği hakkında bir fikri olmalı. Bu bilgiyi de birincil olarak insanların tercihlerini gözlemleyerek elde eder. Dolayısıyla bizim tercihlerimiz, yaşamlarımızın nasıl olmasını istediğimize ilişkin bilgi sunuyor. İşte üç ilke bunlar. O zaman bu ilkelerin, Turing'in önerdiği ''makineyi kapatabilir misin?'' sorusuna nasıl uygulandığını inceleyelim.

So here's a PR2 robot. This is one that we have in our lab, and it has a big red "off" switch right on the back. The question is: Is it going to let you switch it off? If we do it the classical way, we give it the objective of, "Fetch the coffee, I must fetch the coffee, I can't fetch the coffee if I'm dead," so obviously the PR2 has been listening to my talk, and so it says, therefore, "I must disable my 'off' switch, and probably taser all the other people in Starbucks who might interfere with me."

Burada gördüğünüz bir PR2 robotu. Bu gördüğünüz bizim laboratuvarımızdaki PR2. Gördüğünüz gibi arkasında kocaman kırmızı bir ''kapat'' düğmesi var. Sorumuz şu: Onu kapatmanıza izin verecek mi? Bunu klasik şekilde yapacak olursak ve ona ''kahveyi getir, kahveyi getirmeliyim, ölürsem kahveyi getiremem'' amacını verirsek, tabi PR2 konuşmamı dinlemekteydi, dolayısıyla ''kapama düğmemi işlev dışı bırakmalıyım ve Starbucks'ta bana müdahale etmeye çalışan diğer insanları da belki elektrikle şoklarım'' der.

(Laughter)

(Gülüşmeler)

So this seems to be inevitable, right? This kind of failure mode seems to be inevitable, and it follows from having a concrete, definite objective.

Yani bu kaçınılmaz görünüyor, değil mi? Bu çeşit bir hata modu kaçınılmaz görünüyor ve bu da kesin ve net bir hedef konulmasından kaynaklanıyor.

So what happens if the machine is uncertain about the objective? Well, it reasons in a different way. It says, "OK, the human might switch me off, but only if I'm doing something wrong. Well, I don't really know what wrong is, but I know that I don't want to do it." So that's the first and second principles right there. "So I should let the human switch me off." And in fact you can calculate the incentive that the robot has to allow the human to switch it off, and it's directly tied to the degree of uncertainty about the underlying objective.

Peki, eğer makine hedef hakkında emin değilse ne olur? O zaman daha farklı bir mantık yürütür. O zaman der ki, ''Peki, insan beni kapatabilir ama sadece yanlış bir şey yaparsam. Ben neyin yanlış olduğunu tam bilmiyorum ama yanlış yapmak istemediğimi biliyorum.'' Burada birinci ve ikinci ilkeleri görebiliyorsunuz. ''Öyleyse insanın beni kapatmasına izin vermeliyim.'' Aslında bir robotun insana kendini kapatması için izin vermesindeki motivasyonu hesaplayabilirsiniz. Bu doğrudan, komutun belirsizlik derecesinin ne olduğuna bağlıdır.
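The incentive calculation mentioned above can be illustrated with a small sketch. It is not taken from the talk: it assumes the robot's belief about the value U of its planned action (to the human) is a Gaussian with mean `mu` and standard deviation `sigma`, that being switched off is worth 0, and that the human only presses the switch when U is negative. The function names are hypothetical. Under those assumptions, the gain from deferring to the human rather than acting unilaterally is E[max(U, 0)] - max(E[U], 0).

```python
import math


def expected_positive_part(mu: float, sigma: float) -> float:
    """E[max(U, 0)] for U ~ Normal(mu, sigma^2): the value of letting a
    rational human veto the action whenever U turns out to be negative."""
    if sigma == 0.0:
        return max(mu, 0.0)
    z = mu / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return mu * cdf + sigma * pdf


def incentive_to_defer(mu: float, sigma: float) -> float:
    """How much better off the robot expects the human to be if it leaves
    the off switch alone, compared with its best unilateral choice
    (act for mu, or switch itself off for 0)."""
    best_unilateral = max(mu, 0.0)
    return expected_positive_part(mu, sigma) - best_unilateral


if __name__ == "__main__":
    # Assumed belief mean of 0.25; only the uncertainty changes.
    for sigma in (0.0, 0.5, 1.0, 2.0):
        print(f"sigma={sigma}: incentive={incentive_to_defer(0.25, sigma):.4f}")
```

In this toy model the incentive is zero when the robot is certain of the objective and grows as the uncertainty grows, which is the relationship described in the paragraph above.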

And then when the machine is switched off, that third principle comes into play. It learns something about the objectives it should be pursuing, because it learns that what it did wasn't right. In fact, we can, with suitable use of Greek symbols, as mathematicians usually do, we can actually prove a theorem that says that such a robot is provably beneficial to the human. You are provably better off with a machine that's designed in this way than without it. So this is a very simple example, but this is the first step in what we're trying to do with human-compatible AI.

Sonra makine kapatılınca, üçüncü ilke devreye girer. Çünkü izlediği amaçlara ilişkin bir şey öğrenir, çünkü yapmış olduğu şeyin yanlış olduğunu öğrenir. Aslında, matematikçilerin sıkça yaptığı gibi Yunan harflerinin uygun kullanımıyla, böyle bir robotun, insanlara yararının kanıtlanabileceğini söyleyen bir teoremi kanıtlayabiliriz. Bu şekilde tasarlanmış bir makinenin varlığı, yokluğuna oranla sizi daha iyi kılacaktır. Bu çok basit örnek aslında başarmaya çalıştığımız insan-uyumlu YZ yolunda ilk adımımızdır.

Now, this third principle, I think is the one that you're probably scratching your head over. You're probably thinking, "Well, you know, I behave badly. I don't want my robot to behave like me. I sneak down in the middle of the night and take stuff from the fridge. I do this and that." There's all kinds of things you don't want the robot doing. But in fact, it doesn't quite work that way. Just because you behave badly doesn't mean the robot is going to copy your behavior. It's going to understand your motivations and maybe help you resist them, if appropriate. But it's still difficult. What we're trying to do, in fact, is to allow machines to predict for any person and for any possible life that they could live, and the lives of everybody else: Which would they prefer? And there are many, many difficulties involved in doing this; I don't expect that this is going to get solved very quickly. The real difficulties, in fact, are us.

Şimdi bu üçüncü ilke biraz kafanızı karıştırıyor olabilir. Herhalde şunu düşünüyorsunuz: ''Eğer ben yanlış hareket ediyorsam, robotumun da benim gibi hareket etmesini istemiyorum. Geceleri usulca gidip buzluktan bir şeyler aşırıyorum. Bunun gibi şeyler yapıyorum.'' Robotun yapmamasını istediğiniz birçok şey olabilir. Ama aslında bu pek öyle çalışmıyor. Sırf siz kötü bir şey yaptınız diye, robot da sizin hareketinizi taklit edecek değil. Sizi buna iteni anlayacak ve belki de direnmeniz için size yardım edecek, eğer uygunsa. Ama bu hâlâ zor. Aslında yapmaya çalıştığımız şey, makinelerin herhangi biri ve yaşayabilecekleri herhangi bir olası yaşam ve diğer herkesin yaşamı için öngörüler yapmalarını sağlamak: Neyi tercih ederlerdi? Bunu yapmak konusunda çok fazla güçlük var. Bunun yakın zamanda çözüleceğini de sanmıyorum. Gerçek zorluk, aslında biziz.

As I have already mentioned, we behave badly. In fact, some of us are downright nasty. Now the robot, as I said, doesn't have to copy the behavior. The robot does not have any objective of its own. It's purely altruistic. And it's not designed just to satisfy the desires of one person, the user, but in fact it has to respect the preferences of everybody. So it can deal with a certain amount of nastiness, and it can even understand that your nastiness, for example, you may take bribes as a passport official because you need to feed your family and send your kids to school. It can understand that; it doesn't mean it's going to steal. In fact, it'll just help you send your kids to school.

Bahsettiğim gibi, kötü davranışlarımız var. Aslına bakarsanız, bazılarımız gerçekten kötü. Dediğim gibi robot bu davranışları tekrarlamak zorunda değil. Robotun kendine ait bir hedefi yok. Onlar bütünüyle fedakâr. Bu sadece bir kişinin, kullanıcının arzularını tatmin etmek üzere tasarlanmış değil, aslında herkesin tercihlerine saygı duymak için tasarlanırlar. Makineler bir kısım kötü davranışı algılayabilir ve hatta kötü davranışın ardındaki nedenleri de anlayabilir. Örneğin görev başında rüşvet alan bir pasaport memuruysanız, bunu aile geçindirmek ve çocukları okula göndermek için yaptığınızı anlayabilir. Bu onun da çalacağı anlamına gelmez. Aslında çocuklarınızı okula gönderebilmeniz için size yardım edecek.

We are also computationally limited. Lee Sedol is a brilliant Go player, but he still lost. So if we look at his actions, he took an action that lost the game. That doesn't mean he wanted to lose. So to understand his behavior, we actually have to invert through a model of human cognition that includes our computational limitations -- a very complicated model. But it's still something that we can work on understanding.

Bizler ayrıca hesaplama açısından sınırlıyız. Lee Sedol parlak bir Go oyuncusu, ama yine de kaybetti. Hamlelerine bakılırsa, oyunu kaybetmesine neden olan bir hamle yaptı. Bu onun kaybetmek istediği anlamına gelmiyor. Yani onun davranışını anlamak için, bizim hesaplama konusundaki sınırlarımızı da içeren bir insan zekası modeli süzgecinden geçirmek gerek. Bu da çok karmaşık bir model. Ama yine de anlamaya çalışabileceğimiz bir kavram.
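One way to picture "inverting through a model of human cognition" is the following sketch. It is an illustration under stated assumptions, not the model used in the research: the player is assumed to be noisily rational, choosing a move with probability proportional to exp(beta * value), where `beta` is a hypothetical parameter standing in for how reliably the player finds the best move. The move values and the prior are made up for the example.

```python
import math


def choice_probability(values, chosen, beta):
    """Probability that a noisily rational player picks move `chosen`,
    given the value of each move under one hypothesis about their goal."""
    weights = [math.exp(beta * v) for v in values]
    return weights[chosen] / sum(weights)


def posterior_wants_to_win(prior_win, beta):
    """Bayesian update after watching the player make a losing move."""
    # Two candidate moves: index 0 wins the game, index 1 loses it.
    values_if_wants_win = [1.0, -1.0]
    values_if_wants_lose = [-1.0, 1.0]
    observed = 1  # the losing move was played

    like_win = choice_probability(values_if_wants_win, observed, beta)
    like_lose = choice_probability(values_if_wants_lose, observed, beta)
    evidence = prior_win * like_win + (1.0 - prior_win) * like_lose
    return prior_win * like_win / evidence


if __name__ == "__main__":
    # Assumed prior: we are 99% sure the player wants to win.
    for beta in (0.0, 0.5, 5.0):
        print(f"beta={beta}: P(wants to win)={posterior_wants_to_win(0.99, beta):.4f}")
```

With a small `beta` (a player whose computation is limited on a hard problem), one losing move barely shifts the belief that the player wanted to win. With a large `beta` (a model that assumes near-perfect play), the same move would be read, wrongly, as evidence of wanting to lose.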

Probably the most difficult part, from my point of view as an AI researcher, is the fact that there are lots of us, and so the machine has to somehow trade off, weigh up the preferences of many different people, and there are different ways to do that. Economists, sociologists, moral philosophers have understood that, and we are actively looking for collaboration.

Bir YZ araştırmacısı olarak baktığımda, en zor olan kısım belki de aslında çok fazla sayıda insan oluşu. Makineler bir şekilde tüm farklı insanların tercihlerini ve bunların ağırlıklarını analiz etmeliler. Bunu yapmanın farklı yolları var. Ekonomistler, sosyologlar, filozoflar bunu anlamıştı ve biz de aktif olarak bir işbirliği arayışındayız.
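To make "there are different ways to do that" concrete, here is a minimal sketch with two textbook aggregation rules. The plans, the numbers, and the function names are all made up for illustration; the talk does not commit to either rule.

```python
def utilitarian_choice(options):
    """Pick the option with the largest total utility across people."""
    return max(options, key=lambda name: sum(options[name]))


def maximin_choice(options):
    """Pick the option whose worst-off person does best."""
    return max(options, key=lambda name: min(options[name]))


# Hypothetical utilities of two plans for three people.
options = {
    "plan_a": [10, 10, 1],
    "plan_b": [6, 6, 6],
}

if __name__ == "__main__":
    print("utilitarian:", utilitarian_choice(options))
    print("maximin:", maximin_choice(options))
```

The two rules disagree on the same data: summing favors `plan_a` (total 21 against 18), while protecting the worst-off person favors `plan_b` (minimum 6 against 1). Which trade-off a machine should make is the open question raised here.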

Let's have a look and see what happens when you get that wrong. So you can have a conversation, for example, with your intelligent personal assistant that might be available in a few years' time. Think of a Siri on steroids. So Siri says, "Your wife called to remind you about dinner tonight." And of course, you've forgotten. "What? What dinner? What are you talking about?"

O zaman bunu yanlış anladığınızda neler olabileceğine bir bakalım. Örneğin zeki kişisel asistanınızla bir sohbet yapabilirsiniz; bu birkaç yıl içinde mümkün olabilir. Yani steroidler verilmiş bir Siri düşünün. Siri diyor ki, ''Eşin bu akşamki yemek randevunuzu hatırlatmak için aradı.'' Tabi siz bunu unutmuştunuz. ''Ne? Ne yemeği? Neden bahsediyorsun?''

"Uh, your 20th anniversary at 7pm."

''Bu akşam 7'de olan 20. yıldönümü yemeğiniz.''

"I can't do that. I'm meeting with the secretary-general at 7:30. How could this have happened?"

''Ama yemeğe katılamam, saat 7:30'da Genel Sekreter ile buluşuyorum. Bu nasıl olabildi?''

"Well, I did warn you, but you overrode my recommendation."

''Sizi daha önce bununla ilgili uyarmıştım ama önerimi çiğnediniz.''

"Well, what am I going to do? I can't just tell him I'm too busy."

''Peki ne yapacağım? Genel sekretere çok meşgul olduğumu söyleyemem.''

"Don't worry. I arranged for his plane to be delayed."

''Merak etme, uçağının gecikmesini sağladım.''

(Laughter)

(Gülüşmeler)

"Some kind of computer malfunction."

''Bir çeşit bilgisayar hatası.''

(Laughter)

(Gülüşmeler)

"Really? You can do that?"

''Gerçekten mi? Bunu yapabiliyor musun?''

"He sends his profound apologies and looks forward to meeting you for lunch tomorrow."

''Size en içten üzüntülerini iletiyor ve sizinle yarın öğlen yemeğinde görüşmeyi bekliyor.''

(Laughter)

(Gülüşmeler)

So the values here -- there's a slight mistake going on. This is clearly following my wife's values which is "Happy wife, happy life."

Yani buradaki değerler... Yanlış olan bir şeyler var. Bu kesinlikle eşimin değerlerini takip etmekte, ki bu da ''mutlu bir eş, mutlu bir hayat'' oluyor.

(Laughter)

(Gülüşmeler)

It could go the other way. You could come home after a hard day's work, and the computer says, "Long day?"

Ama bu başka türlü de gidebilir. Yorucu bir iş gününden eve geldiğinizde, bilgisayar size ''uzun bir gün müydü?'' diye soruyor.

"Yes, I didn't even have time for lunch."

''Evet, öğle yemeğine bile vaktim olmadı.''

"You must be very hungry."

''Çok aç olmalısın.''

"Starving, yeah. Could you make some dinner?"

''Evet, açlıktan ölüyorum. Yiyecek bir şeyler hazırlayabilir misin?''

"There's something I need to tell you."

''Sana söylemem gereken bir şey var.''

(Laughter)

(Gülüşmeler)

"There are humans in South Sudan who are in more urgent need than you."

''Güney Sudan'da yemeğe senden daha muhtaç insanlar var.''

(Laughter)

(Gülüşmeler)

"So I'm leaving. Make your own dinner."

''Bu yüzden ben ayrılıyorum. Kendi yemeğini kendin yap.''

(Laughter)

(Gülüşmeler)

So we have to solve these problems, and I'm looking forward to working on them.

Bu sorunları çözmeliyiz. Ben de bu sorunların üstünde çalışmayı sabırsızlıkla bekliyorum.

There are reasons for optimism. One reason is, there is a massive amount of data. Because remember -- I said they're going to read everything the human race has ever written. Most of what we write about is human beings doing things and other people getting upset about it. So there's a massive amount of data to learn from.

İyimser olmamız için sebeplerimiz var. Bir nedenimiz varolan büyük bir veri oluşu. Çünkü hatırlarsanız, insanların tarih boyunca yazdığı her şeyi okuyacaklarını söylemiştim. Yazdığımız çoğu şey, insanların yaptığı şeyler ve başkalarının bunlardan dolayı üzülmesi ile ilgili. Yani öğrenilecek birçok şey var.

There's also a very strong economic incentive to get this right. So imagine your domestic robot's at home. You're late from work again and the robot has to feed the kids, and the kids are hungry and there's nothing in the fridge. And the robot sees the cat.

Ayrıca bunu gerçekleştirmek için güçlü bir ekonomik itki de var. Evcil robotunuzun evinizde olduğunu varsayın. Yine evinize geç gelmişsiniz ve robotun çocuklara yemek yedirmesi lazım. Çocuklar aç ve buzdolabında hiçbir şey yok. Robot kedinizi görüyor.

(Laughter)

(Gülüşmeler)

And the robot hasn't quite learned the human value function properly, so it doesn't understand the sentimental value of the cat outweighs the nutritional value of the cat.

Robot henüz insanların değer yargılarını tam olarak kullanmayı öğrenemediğinden, kedinin duygusal değerinin, kedinin besin değerinden daha ağır bastığını anlamıyor.

(Laughter)

(Gülüşmeler)

So then what happens? Well, it happens like this: "Deranged robot cooks kitty for family dinner." That one incident would be the end of the domestic robot industry. So there's a huge incentive to get this right long before we reach superintelligent machines.

O zaman ne olur? Şöyle bir son dakika haberi: ''Bozuk robot akşam yemeği için kediyi pişirdi.'' Böyle tek bir olay, evcil robot endüstrisinin sonu olurdu. Bu yüzden süper-zeki makinelerden önce bunu hatasız yapmamız gerekiyor.

So to summarize: I'm actually trying to change the definition of AI so that we have provably beneficial machines. And the principles are: machines that are altruistic, that want to achieve only our objectives, but that are uncertain about what those objectives are, and will watch all of us to learn more about what it is that we really want. And hopefully in the process, we will learn to be better people. Thank you very much.

Kısaca özetleyecek olursak: Ben yapay zekanın tanımını değiştirmeye çalışıyorum, ki yararları kanıtlanmış makinelere sahip olabilelim. İlkeler şunlar: Fedakâr makineler, sadece bizim hedeflerimize ulaşmak istesinler. Ama bu hedeflerin ne olduğundan emin olmasınlar ve öğrenmek için hepimizi izleyip, gerçekte ne istediğimizi anlasınlar. Biz de ümit ederim ki bu süreçte daha iyi insanlar olmayı öğrenelim. Çok teşekkür ederim.

(Applause)

(Alkışlar)

Chris Anderson: So interesting, Stuart. We're going to stand here a bit because I think they're setting up for our next speaker.

Chris Anderson: Bu çok ilginçti Stuart. Seninle biraz burada duracağız, çünkü sanırım bir sonraki konuşmacı için sahneyi kuruyorlar.

A couple of questions. So the idea of programming in ignorance seems intuitively really powerful. As you get to superintelligence, what's going to stop a robot reading literature and discovering this idea that knowledge is actually better than ignorance and still just shifting its own goals and rewriting that programming?

Sana birkaç sorum var. Robotlara cehaleti de programlamak içgüdüsel olarak çok güçlü bir fikir gibi duruyor. Süper-zekaya yaklaştıkça bir robotun tüm araştırmaları okuyarak, bilginin aslında cehaletten daha iyi olduğunu keşfetmesi ve kendi hedeflerini belirleyip kendi programını yeniden yazmasını ne durdurabilir?

Stuart Russell: Yes, so we want it to learn more, as I said, about our objectives. It'll only become more certain as it becomes more correct, so the evidence is there and it's going to be designed to interpret it correctly. It will understand, for example, that books are very biased in the evidence they contain. They only talk about kings and princes and elite white male people doing stuff. So it's a complicated problem, but as it learns more about our objectives it will become more and more useful to us.

Stuart Russell: Evet, aslında dediğim gibi biz de onun kendi hedeflerimiz hakkında daha çok şey bilmesini istiyoruz. Onu, bunu doğru şekilde yorumlayacak biçimde tasarlayacağız. Örneğin, kitapların içerdikleri kanıtlar açısından son derece taraflı olduğunu anlayacak. Sadece kralların, prenslerin ve elit beyaz erkeklerin yaptıklarından bahsediyorlar. Yani bu karmaşık bir problem, ama bizim hedeflerimizi öğrendikçe, bize giderek daha da yararlı olacaklar.

CA: And you couldn't just boil it down to one law, you know, hardwired in: "if any human ever tries to switch me off, I comply. I comply."

CA: Ve bunu da tek bir kurala indirgeyemeyiz, değil mi, yani ''eğer insan beni kapatmayı denerse, uyacağım. Uyacağım.''

SR: Absolutely not. That would be a terrible idea. So imagine that you have a self-driving car and you want to send your five-year-old off to preschool. Do you want your five-year-old to be able to switch off the car while it's driving along? Probably not. So it needs to understand how rational and sensible the person is. The more rational the person, the more willing you are to be switched off. If the person is completely random or even malicious, then you're less willing to be switched off.

SR: Kesinlikle hayır. Bu berbat bir fikir olurdu. Yani kendini süren bir arabanız olduğunu varsayın ve beş yaşındaki çocuğunuzu anaokuluna göndermek istiyorsunuz. Peki beş yaşındaki çocuğunuzun araba seyir halindeyken arabayı kapatabilmesini ister miydiniz? Muhtemelen hayır. Yani bu durumda robotun, bir insanın ne kadar mantıklı ve rasyonel olduğunu anlaması gerekiyor. Bir insan ne kadar rasyonel ise siz de o kadar kapatılmayı kabul edersiniz. Eğer bir kişi tamamen plansız ve hatta kötü niyetliyse o zaman siz de kapatılmak için daha az hevesli olursunuz.

CA: All right. Stuart, can I just say, I really, really hope you figure this out for us. Thank you so much for that talk. That was amazing.

CA: Peki. Stuart, sadece şunu söyleyebilirim. Umarım bunu hepimiz için çözebilirsin. Konuşman için çok teşekkürler. Muhteşemdi.

SR: Thank you.

SR: Teşekkürler.

(Applause)

(Alkışlar)
