Blaise Agüera y Arcas: How computers are learning to be creative

So, I lead a team at Google that works on machine intelligence; in other words, the engineering discipline of making computers and devices able to do some of the things that brains do. And this makes us interested in real brains and neuroscience as well, and especially interested in the things that our brains do that are still far superior to the performance of computers.

Historically, one of those areas has been perception, the process by which things out there in the world -- sounds and images -- can turn into concepts in the mind. This is essential for our own brains, and it's also pretty useful on a computer. The machine perception algorithms, for example, that our team makes, are what enable your pictures on Google Photos to become searchable, based on what's in them. The flip side of perception is creativity: turning a concept into something out there in the world. So over the past year, our work on machine perception has also unexpectedly connected with the world of machine creativity and machine art.

I think Michelangelo had a penetrating insight into this dual relationship between perception and creativity. This is a famous quote of his: "Every block of stone has a statue inside of it, and the job of the sculptor is to discover it." So I think that what Michelangelo was getting at is that we create by perceiving, and that perception itself is an act of imagination and is the stuff of creativity.

The organ that does all the thinking and perceiving and imagining, of course, is the brain. And I'd like to begin with a brief bit of history about what we know about brains. Because unlike, say, the heart or the intestines, you really can't say very much about a brain by just looking at it, at least with the naked eye. The early anatomists who looked at brains gave the superficial structures of this thing all kinds of fanciful names, like hippocampus, meaning "little shrimp." But of course that sort of thing doesn't tell us very much about what's actually going on inside.

The first person who, I think, really developed some kind of insight into what was going on in the brain was the great Spanish neuroanatomist, Santiago Ramón y Cajal, in the 19th century, who used microscopy and special stains that could selectively fill in or render in very high contrast the individual cells in the brain, in order to start to understand their morphologies. And these are the kinds of drawings that he made of neurons in the 19th century.

This is from a bird brain. And you see this incredible variety of different sorts of cells; even the cellular theory itself was quite new at this point. And these structures, these cells that have these arborizations, these branches that can go very, very long distances -- this was very novel at the time. They're reminiscent, of course, of wires. That might have been obvious to some people in the 19th century; the revolutions of wiring and electricity were just getting underway. But in many ways, these microanatomical drawings of Ramón y Cajal's, like this one, are still unsurpassed.

We're still, more than a century later, trying to finish the job that Ramón y Cajal started. These are raw data from our collaborators at the Max Planck Institute of Neuroscience. And what our collaborators have done is to image little pieces of brain tissue. The entire sample here is about one cubic millimeter in size, and I'm showing you a very, very small piece of it here. That bar on the left is about one micron. The structures you see are mitochondria that are the size of bacteria. And these are consecutive slices through this very, very tiny block of tissue. Just for comparison's sake, the diameter of an average strand of hair is about 100 microns. So we're looking at something much, much smaller than a single strand of hair.

And from these kinds of serial electron microscopy slices, one can start to make reconstructions in 3D of neurons that look like these. So these are sort of in the same style as Ramón y Cajal. Only a few neurons lit up, because otherwise we wouldn't be able to see anything here. It would be so crowded, so full of structure, of wiring all connecting one neuron to another.

So Ramón y Cajal was a little bit ahead of his time, and progress on understanding the brain proceeded slowly over the next few decades. But we knew that neurons used electricity, and by World War II, our technology was advanced enough to start doing real electrical experiments on live neurons to better understand how they worked. This was the very same time when computers were being invented, very much based on the idea of modeling the brain -- of "intelligent machinery," as Alan Turing called it, one of the fathers of computer science.

Warren McCulloch and Walter Pitts looked at Ramón y Cajal's drawing of visual cortex, which I'm showing here. This is the cortex that processes imagery that comes from the eye. And for them, this looked like a circuit diagram. So there are a lot of details in McCulloch and Pitts's circuit diagram that are not quite right. But this basic idea that visual cortex works like a series of computational elements that pass information one to the next in a cascade, is essentially correct.

Let's talk for a moment about what a model for processing visual information would need to do. The basic task of perception is to take an image like this one and say, "That's a bird," which is a very simple thing for us to do with our brains. But you should all understand that for a computer, this was pretty much impossible just a few years ago. The classical computing paradigm is not one in which this task is easy to do.

So what's going on between the pixels, between the image of the bird and the word "bird," is essentially a set of neurons connected to each other in a neural network, as I'm diagramming here. This neural network could be biological, inside our visual cortices, or, nowadays, we start to have the capability to model such neural networks on the computer. And I'll show you what that actually looks like.

So the pixels you can think about as a first layer of neurons, and that's, in fact, how it works in the eye -- that's the neurons in the retina. And those feed forward into one layer after another layer, after another layer of neurons, all connected by synapses of different weights. The behavior of this network is characterized by the strengths of all of those synapses. Those characterize the computational properties of this network. And at the end of the day, you have a neuron or a small group of neurons that light up, saying, "bird."
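
As a rough illustration of the cascade just described, here is a minimal Python sketch of a feed-forward pass. The layer sizes, the weight values, and the `tanh` squashing function are all assumptions made up for this example; the talk does not specify any of them, and real networks have vastly more inputs and weights.

```python
import math

def layer(inputs, weights):
    """One layer: each output neuron sums its weighted inputs, then squashes the result."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def forward(pixels, layers):
    """Feed the pixels through one layer after another and return the final activations."""
    activations = pixels
    for weights in layers:
        activations = layer(activations, weights)
    return activations

# Hypothetical toy network: 3 "pixels" -> 2 hidden neurons -> 1 output neuron.
pixels = [0.0, 0.5, 1.0]
layers = [
    [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]],  # synapse weights into the hidden layer
    [[1.0, -1.0]],                          # synapse weights into the output neuron
]
output = forward(pixels, layers)
print(output)  # a single activation; a larger value would mean a stronger "bird" response
```

The only thing that distinguishes one such network from another is the numbers in `layers`, which is the sense in which the synapse strengths characterize the network's behavior.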

Now I'm going to represent those three things -- the input pixels and the synapses in the neural network, and bird, the output -- by three variables: x, w and y. There are maybe a million or so x's -- a million pixels in that image. There are billions or trillions of w's, which represent the weights of all these synapses in the neural network. And there's a very small number of y's, of outputs that that network has. "Bird" is only four letters, right? So let's pretend that this is just a simple formula, x "x" w = y. I'm putting the times in scare quotes because what's really going on there, of course, is a very complicated series of mathematical operations.

That's one equation. There are three variables. And we all know that if you have one equation, you can solve one variable by knowing the other two things. So the problem of inference, that is, figuring out that the picture of a bird is a bird, is this one: it's where y is the unknown and w and x are known. You know the neural network, you know the pixels. As you can see, that's actually a relatively straightforward problem. You multiply two times three and you're done. I'll show you an artificial neural network that we've built recently, doing exactly that.

This is running in real time on a mobile phone, and that's, of course, amazing in its own right, that mobile phones can do so many billions and trillions of operations per second. What you're looking at is a phone looking at one after another picture of a bird, and actually not only saying, "Yes, it's a bird," but identifying the species of bird with a network of this sort. So in that picture, the x and the w are known, and the y is the unknown. I'm glossing over the very difficult part, of course, which is how on earth do we figure out the w, the brain that can do such a thing? How would we ever learn such a model?

So this process of learning, of solving for w, if we were doing this with the simple equation in which we think about these as numbers, we know exactly how to do that: 6 = 2 x w, well, we divide by two and we're done. The problem is with this operator. So, division -- we've used division because it's the inverse to multiplication, but as I've just said, the multiplication is a bit of a lie here. This is a very, very complicated, very non-linear operation; it has no inverse. So we have to figure out a way to solve the equation without a division operator. And the way to do that is fairly straightforward. You just say, let's play a little algebra trick, and move the six over to the right-hand side of the equation. Now, we're still using multiplication. And that zero -- let's think about it as an error. In other words, if we've solved for w the right way, then the error will be zero. And if we haven't gotten it quite right, the error will be greater than zero.

So now we can just take guesses to minimize the error, and that's the sort of thing computers are very good at. So you've taken an initial guess: what if w = 0? Well, then the error is 6. What if w = 1? The error is 4. And then the computer can sort of play Marco Polo, and drive down the error close to zero. As it does that, it's getting successive approximations to w. Typically, it never quite gets there, but after about a dozen steps, we're up to w = 2.999, which is close enough. And this is the learning process.
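
The guessing game above can be sketched in a few lines of Python. This is a minimal illustration that assumes gradient descent on the squared error as the update rule, with a step size (`rate`) and step count picked by hand for this toy equation; the talk itself does not name a specific algorithm.

```python
def error(w, x=2.0, y=6.0):
    """Absolute mismatch between the prediction x * w and the target y."""
    return abs(y - x * w)

def learn_w(x=2.0, y=6.0, w=0.0, rate=0.0625, steps=12):
    """Gradient descent on the squared error (x * w - y) ** 2, starting from a guess w."""
    history = [w]
    for _ in range(steps):
        gradient = 2.0 * x * (x * w - y)  # derivative of (x * w - y) ** 2 with respect to w
        w -= rate * gradient              # step downhill, toward a smaller error
        history.append(w)
    return w, history

print(error(0.0), error(1.0))  # 6.0 4.0, the two guesses from the talk
w, history = learn_w()
print(round(w, 3))             # 2.999
```

With these hand-picked settings each step halves the remaining distance to 3, so twelve steps from w = 0 land just under 3 without ever reaching it exactly.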

So remember that what's been going on here is that we've been taking a lot of known x's and known y's and solving for the w in the middle through an iterative process. It's exactly the same way that we do our own learning. We have many, many images as babies and we get told, "This is a bird; this is not a bird." And over time, through iteration, we solve for w, we solve for those neural connections.

So now, we've held x and w fixed to solve for y; that's everyday, fast perception. We figure out how we can solve for w, that's learning, which is a lot harder, because we need to do error minimization, using a lot of training examples.

And about a year ago, Alex Mordvintsev, on our team, decided to experiment with what happens if we try solving for x, given a known w and a known y. In other words, you know that it's a bird, and you already have your neural network that you've trained on birds, but what is the picture of a bird? It turns out that by using exactly the same error-minimization procedure, one can do that with the network trained to recognize birds, and the result turns out to be ... a picture of birds. So this is a picture of birds generated entirely by a neural network that was trained to recognize birds, just by solving for x rather than solving for y, and doing that iteratively.
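
To make "solving for x" concrete, here is a toy Python sketch. It assumes a one-neuron stand-in for the trained network (a weighted sum passed through a sigmoid) and plain gradient descent on the squared error; the weights, the target value, the step size, and the function names are all hypothetical and are not taken from the team's actual system. The point is only that the same error-minimization loop can adjust the pixels while the weights stay fixed.

```python
import math

def detector(x, w):
    """A toy one-neuron 'bird detector': weighted sum squashed to the range (0, 1)."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def solve_for_x(w, y, x=None, rate=0.5, steps=500):
    """Hold w and y fixed; adjust the pixels x until detector(x, w) is close to y."""
    x = [0.0] * len(w) if x is None else list(x)
    for _ in range(steps):
        out = detector(x, w)
        # Chain rule for the derivative of (out - y) ** 2 through the sigmoid.
        scale = 2.0 * (out - y) * out * (1.0 - out)
        x = [xi - rate * scale * wi for xi, wi in zip(x, w)]
    return x

w = [0.5, -1.0, 2.0, 0.25]   # hypothetical trained weights, never modified here
x = solve_for_x(w, y=0.9)    # y = 0.9 stands for "fairly sure this is a bird"
print(detector(x, w))        # close to the requested 0.9
```

The optional `x` argument lets the search begin from an existing image instead of a blank one; the weights `w` are read but never updated.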

Here's another fun example. This was a work made by Mike Tyka in our group, which he calls "Animal Parade." It reminds me a little bit of William Kentridge's artworks, in which he makes sketches, rubs them out, makes sketches, rubs them out, and creates a movie this way. In this case, what Mike is doing is varying y over the space of different animals, in a network designed to recognize and distinguish different animals from each other. And you get this strange, Escher-like morph from one animal to another.

Here he and Alex together have tried reducing the y's to a space of only two dimensions, thereby making a map out of the space of all things recognized by this network. Doing this kind of synthesis or generation of imagery over that entire surface, varying y over the surface, you make a kind of map -- a visual map of all the things the network knows how to recognize. The animals are all here; "armadillo" is right in that spot.

You can do this with other kinds of networks as well. This is a network designed to recognize faces, to distinguish one face from another. And here, we're putting in a y that says, "me," my own face parameters. And when this thing solves for x, it generates this rather crazy, kind of cubist, surreal, psychedelic picture of me from multiple points of view at once. The reason it looks like multiple points of view at once is because that network is designed to get rid of the ambiguity of a face being in one pose or another pose, being looked at with one kind of lighting, another kind of lighting. So when you do this sort of reconstruction, if you don't use some sort of guide image or guide statistics, then you'll get a sort of confusion of different points of view, because it's ambiguous. This is what happens if Alex uses his own face as a guide image during that optimization process to reconstruct my own face. So you can see it's not perfect. There's still quite a lot of work to do on how we optimize that optimization process. But you start to get something more like a coherent face, rendered using my own face as a guide.

You don't have to start with a blank canvas or with white noise. When you're solving for x, you can begin with an x, that is itself already some other image. That's what this little demonstration is. This is a network that is designed to categorize all sorts of different objects -- man-made structures, animals ... Here we're starting with just a picture of clouds, and as we optimize, basically, this network is figuring out what it sees in the clouds. And the more time you spend looking at this, the more things you also will see in the clouds. You could also use the face network to hallucinate into this, and you get some pretty crazy stuff.

(Laughter)

Or, Mike has done some other experiments in which he takes that cloud image, hallucinates, zooms, hallucinates, zooms, hallucinates, zooms. And in this way, you can get a sort of fugue state of the network, I suppose, or a sort of free association, in which the network is eating its own tail. So every image is now the basis for, "What do I think I see next? What do I think I see next? What do I think I see next?"
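
The shape of that feedback loop can be sketched in Python as follows. Everything here is a stand-in chosen for illustration: the "network" is a fixed linear detector whose response is a weighted sum of pixels, `hallucinate` is one gradient-ascent step on that response, and `zoom` is a center crop scaled back up with nearest-neighbor sampling. None of these names or choices come from Mike Tyka's actual code.

```python
def score(image, pattern):
    """Response of a toy linear detector: the weighted sum of all pixels."""
    return sum(px * w for row, wrow in zip(image, pattern) for px, w in zip(row, wrow))

def hallucinate(image, pattern, rate=0.1):
    """One gradient-ascent step: for a linear detector the gradient is the pattern itself."""
    return [[px + rate * w for px, w in zip(row, wrow)]
            for row, wrow in zip(image, pattern)]

def zoom(image):
    """Crop the central half of a square image and scale it back up (even side length assumed)."""
    n = len(image)
    q = n // 4
    crop = [row[q:q + n // 2] for row in image[q:q + n // 2]]
    return [[crop[r // 2][c // 2] for c in range(n)] for r in range(n)]

def dream(image, pattern, rounds=3):
    """Hallucinate, zoom, repeat: every output frame becomes the next input."""
    frames = [image]
    for _ in range(rounds):
        image = zoom(hallucinate(image, pattern))
        frames.append(image)
    return frames

clouds = [[0.0] * 4 for _ in range(4)]   # a blank stand-in for the cloud photograph
pattern = [[1.0, -1.0, 1.0, -1.0],
           [-1.0, 2.0, 2.0, -1.0],
           [1.0, 2.0, 2.0, -1.0],
           [-1.0, 1.0, -1.0, 1.0]]       # hypothetical detector weights
frames = dream(clouds, pattern)
print(len(frames))                        # 4: the starting image plus three rounds
```

Because each frame is produced from the previous one, whatever the detector started to "see" in one round is magnified and reinterpreted in the next.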

I showed this for the first time in public to a group at a lecture in Seattle called "Higher Education" -- this was right after marijuana was legalized.

(Laughter)

So I'd like to finish up quickly by just noting that this technology is not constrained. I've shown you purely visual examples because they're really fun to look at. It's not a purely visual technology. Our artist collaborator, Ross Goodwin, has done experiments involving a camera that takes a picture, and then a computer in his backpack writes a poem using neural networks, based on the contents of the image. And that poetry neural network has been trained on a large corpus of 20th-century poetry. And the poetry is, you know, I think, kind of not bad, actually.

(Laughter)

In closing, I think that per Michelangelo, I think he was right; perception and creativity are very intimately connected. What we've just seen are neural networks that are entirely trained to discriminate, or to recognize different things in the world, able to be run in reverse, to generate. One of the things that suggests to me is not only that Michelangelo really did see the sculpture in the blocks of stone, but that any creature, any being, any alien that is able to do perceptual acts of that sort is also able to create because it's exactly the same machinery that's used in both cases.

Also, I think that perception and creativity are by no means uniquely human. We start to have computer models that can do exactly these sorts of things. And that ought to be unsurprising; the brain is computational.

And finally, computing began as an exercise in designing intelligent machinery. It was very much modeled after the idea of how could we make machines intelligent. And we finally are starting to fulfill now some of the promises of those early pioneers, of Turing and von Neumann and McCulloch and Pitts. And I think that computing is not just about accounting or playing Candy Crush or something. From the beginning, we modeled them after our minds. And they give us both the ability to understand our own minds better and to extend them.

Thank you very much.

(Applause)
