Jeremy Howard: The wonderful and terrifying implications of computers that can learn

It used to be that if you wanted to get a computer to do something new, you would have to program it. Now, programming, for those of you here that haven't done it yourself, requires laying out in excruciating detail every single step that you want the computer to do in order to achieve your goal. Now, if you want to do something that you don't know how to do yourself, then this is going to be a great challenge.

So this was the challenge faced by this man, Arthur Samuel. In 1956, he wanted to get this computer to be able to beat him at checkers. How can you write a program, lay out in excruciating detail, how to be better than you at checkers? So he came up with an idea: he had the computer play against itself thousands of times and learn how to play checkers. And indeed it worked, and in fact, by 1962, this computer had beaten the Connecticut state champion.
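The self-play idea can be illustrated with a toy sketch. It is not Samuel's checkers program: as an assumption for brevity it uses a much smaller game (take 1 to 3 stones from a pile, whoever takes the last stone wins) and a simple tabular value update. The names `train_self_play` and `best_move` are hypothetical.

```python
import random

def train_self_play(max_pile=10, episodes=5000, alpha=0.5, epsilon=0.3, seed=0):
    """Learn move values for a small stone-taking game purely by self-play."""
    rng = random.Random(seed)
    # q[(pile, take)]: learned value of taking `take` stones from `pile`,
    # seen from the player making the move (+1 = win, -1 = loss).
    q = {(p, t): 0.0 for p in range(1, max_pile + 1) for t in (1, 2, 3) if t <= p}

    def best_value(pile):
        return max(q[(pile, t)] for t in (1, 2, 3) if t <= pile)

    for _ in range(episodes):
        pile = rng.randint(1, max_pile)  # start each game from a random pile
        while pile > 0:
            moves = [t for t in (1, 2, 3) if t <= pile]
            if rng.random() < epsilon:
                take = rng.choice(moves)  # occasionally explore a random move
            else:
                take = max(moves, key=lambda t: q[(pile, t)])
            remaining = pile - take
            # Taking the last stone wins. Otherwise the opponent moves next,
            # so whatever is good for them is bad for the current player.
            target = 1.0 if remaining == 0 else -best_value(remaining)
            q[(pile, take)] += alpha * (target - q[(pile, take)])
            pile = remaining
    return q

def best_move(q, pile):
    """Return the move the learned table currently rates highest."""
    return max((t for t in (1, 2, 3) if t <= pile), key=lambda t: q[(pile, t)])

q = train_self_play()
print({pile: best_move(q, pile) for pile in (5, 6, 7)})
```

Nobody tells the program the winning strategy. It only plays both sides and records which moves led to wins, and in this tiny game that is enough for it to learn to leave the opponent a multiple of four stones.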

So Arthur Samuel was the father of machine learning, and I have a great debt to him, because I am a machine learning practitioner. I was the president of Kaggle, a community of over 200,000 machine learning practitioners. Kaggle puts up competitions to try and get them to solve previously unsolved problems, and it's been successful hundreds of times. So from this vantage point, I was able to find out a lot about what machine learning could do in the past, can do today, and what it could do in the future. Perhaps the first big success of machine learning commercially was Google. Google showed that it is possible to find information by using a computer algorithm, and this algorithm is based on machine learning. Since that time, there have been many commercial successes of machine learning. Companies like Amazon and Netflix use machine learning to suggest products that you might like to buy, movies that you might like to watch. Sometimes, it's almost creepy. Companies like LinkedIn and Facebook sometimes will tell you about who your friends might be and you have no idea how it did it, and this is because it's using the power of machine learning. These are algorithms that have learned how to do this from data rather than being programmed by hand.

This is also how IBM was successful in getting Watson to beat the two world champions at "Jeopardy," answering incredibly subtle and complex questions like this one. ["The ancient 'Lion of Nimrud' went missing from this city's national museum in 2003 (along with a lot of other stuff)"] This is also why we are now able to see the first self-driving cars. If you want to be able to tell the difference between, say, a tree and a pedestrian, well, that's pretty important. We don't know how to write those programs by hand, but with machine learning, this is now possible. And in fact, this car has driven over a million miles without any accidents on regular roads.

So we now know that computers can learn, and computers can learn to do things that we actually sometimes don't know how to do ourselves, or maybe can do them better than us. One of the most amazing examples I've seen of machine learning happened on a project that I ran at Kaggle where a team run by a guy called Geoffrey Hinton from the University of Toronto won a competition for automatic drug discovery. Now, what was extraordinary here is not just that they beat all of the algorithms developed by Merck or the international academic community, but nobody on the team had any background in chemistry or biology or life sciences, and they did it in two weeks. How did they do this? They used an extraordinary algorithm called deep learning. So important was this that in fact the success was covered in The New York Times in a front page article a few weeks later. This is Geoffrey Hinton here on the left-hand side. Deep learning is an algorithm inspired by how the human brain works, and as a result it's an algorithm which has no theoretical limitations on what it can do. The more data you give it and the more computation time you give it, the better it gets.

The New York Times also showed in this article another extraordinary result of deep learning which I'm going to show you now. It shows that computers can listen and understand.

(Video) Richard Rashid: Now, the last step that I want to be able to take in this process is to actually speak to you in Chinese. Now the key thing there is, we've been able to take a large amount of information from many Chinese speakers and produce a text-to-speech system that takes Chinese text and converts it into Chinese language, and then we've taken an hour or so of my own voice and we've used that to modulate the standard text-to-speech system so that it would sound like me. Again, the result's not perfect. There are in fact quite a few errors. (In Chinese) (Applause) There's much work to be done in this area. (In Chinese) (Applause)

Jeremy Howard: Well, that was at a machine learning conference in China. It's not often, actually, at academic conferences that you do hear spontaneous applause, although of course sometimes at TEDx conferences, feel free. Everything you saw there was happening with deep learning. (Applause) Thank you. The transcription in English was deep learning. The translation to Chinese and the text in the top right, deep learning, and the construction of the voice was deep learning as well.

So deep learning is this extraordinary thing. It's a single algorithm that can seem to do almost anything, and I discovered that a year earlier, it had also learned to see. In this obscure competition from Germany called the German Traffic Sign Recognition Benchmark, deep learning had learned to recognize traffic signs like this one. Not only could it recognize the traffic signs better than any other algorithm, the leaderboard actually showed it was better than people, about twice as good as people. So by 2011, we had the first example of computers that can see better than people. Since that time, a lot has happened. In 2012, Google announced that they had a deep learning algorithm watch YouTube videos and crunched the data on 16,000 computers for a month, and the computer independently learned about concepts such as people and cats just by watching the videos. This is much like the way that humans learn. Humans don't learn by being told what they see, but by learning for themselves what these things are. Also in 2012, Geoffrey Hinton, who we saw earlier, won the very popular ImageNet competition, looking to try to figure out from one and a half million images what they're pictures of. As of 2014, we're now down to a six percent error rate in image recognition. This is better than people, again.

So machines really are doing an extraordinarily good job of this, and it is now being used in industry. For example, Google announced last year that they had mapped every single location in France in two hours, and the way they did it was that they fed street view images into a deep learning algorithm to recognize and read street numbers. Imagine how long it would have taken before: dozens of people, many years. This is also happening in China. Baidu is kind of the Chinese Google, I guess, and what you see here in the top left is an example of a picture that I uploaded to Baidu's deep learning system, and underneath you can see that the system has understood what that picture is and found similar images. The similar images actually have similar backgrounds, similar directions of the faces, even some with their tongue out. This is clearly not looking at the text of a web page. All I uploaded was an image. So we now have computers which really understand what they see and can therefore search databases of hundreds of millions of images in real time.

So what does it mean now that computers can see? Well, it's not just that computers can see. In fact, deep learning has done more than that. Complex, nuanced sentences like this one are now understandable with deep learning algorithms. As you can see here, this Stanford-based system showing the red dot at the top has figured out that this sentence is expressing negative sentiment. Deep learning now in fact is near human performance at understanding what sentences are about and what they are saying about those things. Also, deep learning has been used to read Chinese, again at about native Chinese speaker level. This algorithm was developed in Switzerland by people, none of whom speak or understand any Chinese. As I say, deep learning is about the best system in the world for this, even compared to native human understanding.

This is a system that we put together at my company which shows putting all this stuff together. These are pictures which have no text attached, and as I'm typing in here sentences, in real time it's understanding these pictures and figuring out what they're about and finding pictures that are similar to the text that I'm writing. So you can see, it's actually understanding my sentences and actually understanding these pictures. I know that you've seen something like this on Google, where you can type in things and it will show you pictures, but actually what it's doing is it's searching the webpage for the text. This is very different from actually understanding the images. This is something that computers have only been able to do for the first time in the last few months.

So we can see now that computers can not only see but they can also read, and, of course, we've shown that they can understand what they hear. Perhaps not surprising now that I'm going to tell you they can write. Here is some text that I generated using a deep learning algorithm yesterday. And here is some text that an algorithm out of Stanford generated. Each of these sentences was generated by a deep learning algorithm to describe each of those pictures. This algorithm has never before seen a man in a black shirt playing a guitar. It's seen a man before, it's seen black before, it's seen a guitar before, but it has independently generated this novel description of this picture. We're still not quite at human performance here, but we're close. In tests, humans prefer the computer-generated caption one out of four times. Now, this system is only two weeks old, so probably within the next year, the computer algorithm will be well past human performance at the rate things are going. So computers can also write.

So we put all this together and it leads to very exciting opportunities. For example, in medicine, a team in Boston announced that they had discovered dozens of new clinically relevant features of tumors which help doctors make a prognosis of a cancer. Very similarly, at Stanford, a group there announced that, looking at tissues under magnification, they've developed a machine learning-based system which in fact is better than human pathologists at predicting survival rates for cancer sufferers. In both of these cases, not only were the predictions more accurate, but they generated new insightful science. In the radiology case, they were new clinical indicators that humans can understand. In this pathology case, the computer system actually discovered that the cells around the cancer are as important as the cancer cells themselves in making a diagnosis. This is the opposite of what pathologists had been taught for decades. In each of those two cases, they were systems developed by a combination of medical experts and machine learning experts, but as of last year, we're now beyond that too. This is an example of identifying cancerous areas of human tissue under a microscope. The system being shown here can identify those areas more accurately, or about as accurately, as human pathologists, but was built entirely with deep learning using no medical expertise by people who have no background in the field. Similarly, here, this neuron segmentation. We can now segment neurons about as accurately as humans can, but this system was developed with deep learning by people with no previous background in medicine.

So myself, as somebody with no previous background in medicine, I seem to be entirely well qualified to start a new medical company, which I did. I was kind of terrified of doing it, but the theory seemed to suggest that it ought to be possible to do very useful medicine using just these data analytic techniques. And thankfully, the feedback has been fantastic, not just from the media but from the medical community, who have been very supportive. The theory is that we can take the middle part of the medical process and turn that into data analysis as much as possible, leaving doctors to do what they're best at. I want to give you an example. It now takes us about 15 minutes to generate a new medical diagnostic test and I'll show you that in real time now, but I've compressed it down to three minutes by cutting some pieces out. Rather than showing you creating a medical diagnostic test, I'm going to show you a diagnostic test of car images, because that's something we can all understand.

So here we're starting with about 1.5 million car images, and I want to create something that can split them into the angle of the photo that's being taken. So these images are entirely unlabeled, so I have to start from scratch. With our deep learning algorithm, it can automatically identify areas of structure in these images. So the nice thing is that the human and the computer can now work together. So the human, as you can see here, is telling the computer about areas of interest which it wants the computer then to try and use to improve its algorithm. Now, these deep learning systems actually are in 16,000-dimensional space, so you can see here the computer rotating this through that space, trying to find new areas of structure. And when it does so successfully, the human who is driving it can then point out the areas that are interesting. So here, the computer has successfully found areas, for example, angles. So as we go through this process, we're gradually telling the computer more and more about the kinds of structures we're looking for. You can imagine in a diagnostic test this would be a pathologist identifying areas of pathosis, for example, or a radiologist indicating potentially troublesome nodules. And sometimes it can be difficult for the algorithm. In this case, it got kind of confused. The fronts and the backs of the cars are all mixed up. So here we have to be a bit more careful, manually selecting these fronts as opposed to the backs, then telling the computer that this is a type of group that we're interested in.

So we do that for a while, we skip over a little bit, and then we train the machine learning algorithm based on these couple of hundred things, and we hope that it's gotten a lot better. You can see, it's now started to fade some of these pictures out, showing us that it already is recognizing how to understand some of these itself. We can then use this concept of similar images, and using similar images, you can now see, the computer at this point is able to entirely find just the fronts of cars. So at this point, the human can tell the computer, okay, yes, you've done a good job of that.
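One common way to implement "find similar images" is to compare feature vectors rather than pixels or surrounding text. The sketch below assumes each image has already been turned into a short, non-zero vector of numbers by some model; the three-number vectors, the image names, and the function `most_similar` are made up for illustration and are not taken from the system shown in the talk.

```python
import math

def cosine(a, b):
    """Cosine similarity of two non-zero vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query, library, k=2):
    """Return the names of the k library items most similar to the query."""
    ranked = sorted(library, key=lambda name: cosine(query, library[name]), reverse=True)
    return ranked[:k]

# Hypothetical feature vectors standing in for what a deep network would produce.
library = {
    "front_1": [0.9, 0.1, 0.0],
    "front_2": [0.8, 0.2, 0.1],
    "back_1": [0.1, 0.9, 0.0],
    "side_1": [0.0, 0.1, 0.9],
}
print(most_similar([1.0, 0.1, 0.0], library))  # ['front_1', 'front_2']
```

Because the comparison runs on learned features, a query that looks like the front of a car retrieves other fronts even though no image carries a text label.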

Sometimes, of course, even at this point it's still difficult to separate out groups. In this case, even after we let the computer try to rotate this for a while, we still find that the left-side and right-side pictures are all mixed up together. So we can again give the computer some hints, and we say, okay, try and find a projection that separates out the left sides and the right sides as much as possible using this deep learning algorithm. And giving it that hint -- ah, okay, it's been successful. It's managed to find a way of thinking about these objects that's separated these out.
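A minimal sketch of "find a projection that separates two groups", under simplifying assumptions: the talk does not say how its system does this, so the code swaps in the simplest possible choice, projecting onto the line that joins the two group averages. The function names and the two-dimensional example points are hypothetical.

```python
def mean_vector(vectors):
    """Average a list of equal-length vectors, component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def separating_projection(group_a, group_b):
    """Return a direction and a threshold that split the two hinted groups."""
    mean_a, mean_b = mean_vector(group_a), mean_vector(group_b)
    # Project onto the line pointing from group A's average to group B's.
    direction = [b - a for a, b in zip(mean_a, mean_b)]
    # Cut halfway between the two projected averages.
    threshold = (dot(mean_a, direction) + dot(mean_b, direction)) / 2
    return direction, threshold

def side_of(vector, direction, threshold):
    return "b" if dot(vector, direction) > threshold else "a"

# Hypothetical hints from the human: a few left-side and right-side examples.
left = [[1.0, 5.0], [1.2, 4.0], [0.8, 6.0]]
right = [[3.0, 5.1], [3.2, 4.2], [2.9, 5.9]]
direction, threshold = separating_projection(left, right)
print([side_of(v, direction, threshold) for v in left + right])
```

The second coordinate varies a lot within each group but says little about which group a point is in, so the chosen direction mostly ignores it. That is the sense in which a handful of hints lets the computer pick a useful view of the data.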

So you get the idea here. This is a case not where the human is being replaced by a computer, but where they're working together. What we're doing here is we're replacing something that used to take a team of five or six people about seven years and replacing it with something that takes 15 minutes for one person acting alone.

So this process takes about four or five iterations. You can see we now have 62 percent of our 1.5 million images classified correctly. And at this point, we can start to quite quickly grab whole big sections and check through them to make sure that there are no mistakes. Where there are mistakes, we can let the computer know about them. And using this kind of process for each of the different groups, we are now up to an 80 percent success rate in classifying the 1.5 million images. And at this point, it's just a case of finding the small number that aren't classified correctly, and trying to understand why. And using that approach, by 15 minutes we get to 97 percent classification rates.

Also könnten wir mit dieser Technik ein großes Problem beheben, nämlich das Fehlen medizinischen Fachwissens in der Welt. Laut Weltwirtschaftsforum gibt es in Entwicklungsländern 10- bis 20-mal zu wenige Ärzte, und es würde etwa 300 Jahre dauern, genug Leute auszubilden, um das Problem zu beheben. Stellen Sie sich also vor, wir könnten ihre Effizienz mit diesen Deep-Learning-Ansätzen steigern.

So this kind of technique could allow us to fix a major problem, which is that there's a lack of medical expertise in the world. The World Economic Forum says that there's between a 10x and a 20x shortage of physicians in the developing world, and it would take about 300 years to train enough people to fix that problem. So imagine if we can help enhance their efficiency using these deep learning approaches?

Ich bin ganz begeistert von den Möglichkeiten. Ich mache mir auch Sorgen über die Probleme. Das Problem hierbei ist: In jedem blauen Bereich auf der Karte machen Dienstleistungen über 80 % der Beschäftigung aus. Was sind Dienstleistungen? Das sind Dienstleistungen. Das sind außerdem genau die Dinge, die Computer gerade gelernt haben. Also sind 80 % der Beschäftigung der entwickelten Welt Dinge, die Computer gerade gelernt haben. Was bedeutet das? Naja, es wird alles gut. Andere Jobs ersetzen diese. Zum Beispiel wird es mehr Jobs für Datenwissenschaftler geben. Nun, nicht ganz. Datenwissenschaftler brauchen nicht lange, um diese Dinge zu bauen. Zum Beispiel wurden diese vier Algorithmen alle vom selben Mann gebaut. Wenn Sie also denken, oh, das ist alles nicht neu, wir haben in der Vergangenheit gesehen, dass Jobs durch neue Jobs ersetzt werden, wenn etwas Neues kommt: Was also sind diese neuen Jobs? Das ist sehr schwer einzuschätzen, weil menschliche Leistung schrittweise wächst, aber wir haben jetzt ein System, Deep Learning, das seine Leistung nachweislich exponentiell steigert. Und da sind wir. Zurzeit sehen wir die Dinge um uns herum und sagen: "Computer sind immer noch ziemlich dumm." Oder? Aber in fünf Jahren werden Computer außerhalb dieses Diagramms liegen. Wir müssen also schon jetzt anfangen, über diese Leistung nachzudenken.

So I'm very excited about the opportunities. I'm also concerned about the problems. The problem here is that every area in blue on this map is somewhere where services are over 80 percent of employment. What are services? These are services. These are also the exact things that computers have just learned how to do. So 80 percent of employment in the developed world is stuff that computers have just learned how to do. What does that mean? Well, it'll be fine. They'll be replaced by other jobs. For example, there will be more jobs for data scientists. Well, not really. It doesn't take data scientists very long to build these things. For example, these four algorithms were all built by the same guy. So if you think, oh, it's all happened before, we've seen the results in the past of when new things come along and old jobs get replaced by new jobs, what are these new jobs going to be? It's very hard for us to estimate this, because human performance grows at this gradual rate, but we now have a system, deep learning, that we know actually grows in capability exponentially. And we're here. So currently, we see the things around us and we say, "Oh, computers are still pretty dumb." Right? But in five years' time, computers will be off this chart. So we need to be starting to think about this capability right now.

Wir haben das natürlich schon einmal gesehen. Die Industrielle Revolution bewirkte durch Motoren einen sprunghaften Anstieg der Leistungsfähigkeit. Aber nach einer Weile flachte die Entwicklung ab. Es gab soziale Umbrüche, aber sobald Motoren in allen Bereichen zur Krafterzeugung genutzt wurden, beruhigten sich die Dinge. Die Revolution des Maschinellen Lernens wird ganz anders als die Industrielle Revolution, weil sie nie zur Ruhe kommt. Je besser Computer bei intellektuellen Aktivitäten werden, desto bessere Computer können sie bauen, die intellektuell noch leistungsfähiger sind. Das wird also eine Art Wandel, den die Welt nie zuvor erlebt hat, sodass sich Ihr bisheriges Verständnis des Möglichen ändern muss.

We have seen this once before, of course. In the Industrial Revolution, we saw a step change in capability thanks to engines. The thing is, though, that after a while, things flattened out. There was social disruption, but once engines were used to generate power in all the situations, things really settled down. The Machine Learning Revolution is going to be very different from the Industrial Revolution, because the Machine Learning Revolution never settles down. The better computers get at intellectual activities, the more they can build better computers to be better at intellectual activities, so this is going to be a kind of change that the world has actually never experienced before, so your previous understanding of what's possible has to change.

Das beeinflusst uns schon jetzt. In den letzten 25 Jahren ist die Produktivität des Kapitals gestiegen, aber die Produktivität der Arbeit blieb gleich und sank sogar ein bisschen.

This is already impacting us. In the last 25 years, as capital productivity has increased, labor productivity has been flat, in fact even a little bit down.

Deswegen will ich, dass wir diese Diskussion jetzt führen. Wenn ich Leuten von dieser Situation erzähle, sind sie oft sehr abschätzig. Computer denken nicht wirklich, sie fühlen nichts, sie verstehen Lyrik nicht, wir verstehen nicht wirklich, wie sie funktionieren. Ja, und? Computer können schon jetzt die Dinge tun, für die Menschen die meiste Zeit bezahlt werden. Wir sollten also jetzt überlegen, wie wir unsere sozialen und wirtschaftlichen Strukturen anpassen, um dieser neuen Realität Rechnung zu tragen. Danke. (Applaus)

So I want us to start having this discussion now. I know that often when I tell people about this situation, they can be quite dismissive. Well, computers can't really think, they don't emote, they don't understand poetry, we don't really understand how they work. So what? Computers right now can do the things that humans spend most of their time being paid to do, so now's the time to start thinking about how we're going to adjust our social structures and economic structures to be aware of this new reality. Thank you. (Applause)

Die New York Times zeigte in diesem Artikel noch ein außergewöhnliches Resultat des Deep Learning, das ich Ihnen jetzt vorstellen will. Es zeigt, dass Computer zuhören und verstehen können.

The New York Times also showed in this article another extraordinary result of deep learning which I'm going to show you now. It shows that computers can listen and understand.
