Jeremy Howard: The wonderful and terrifying implications of computers that can learn

It used to be that if you wanted to get a computer to do something new, you would have to program it. Now, programming, for those of you here that haven't done it yourself, requires laying out in excruciating detail every single step that you want the computer to do in order to achieve your goal. Now, if you want to do something that you don't know how to do yourself, then this is going to be a great challenge.

예전에는 컴퓨터가 새로운 일을 하게 만들려면 프로그램을 짜야 했습니다. 프로그래밍을 해본 적이 없는 분들은 목표를 달성하기 위해서 컴퓨터가 해야 할 일을 매 단계마다 고통스러울정도로 세세하게 설정해야 합니다. 자, 하는 방법을 모르는 일을 여러분이 하고 싶다면 그건 아주 커다란 도전이 되겠죠.

So this was the challenge faced by this man, Arthur Samuel. In 1956, he wanted to get this computer to be able to beat him at checkers. How can you write a program, lay out in excruciating detail, how to be better than you at checkers? So he came up with an idea: he had the computer play against itself thousands of times and learn how to play checkers. And indeed it worked, and in fact, by 1962, this computer had beaten the Connecticut state champion.

이것이 아서 사무엘이 직면한 도전이었습니다. 1956년 그는 컴퓨터가 서양장기에서 그를 이기기를 바랬습니다. 프로그램을 어떻게 짤 수 있을까요? 서양장기에서 여러분보다 잘하도록 극심한 세부사항을 쓸 수 있을까요? 그는 새로운 생각을 했습니다. 컴퓨터가 스스로와 수천 번의 서양장기를 두게 해서 서양장기 두는 법을 배우게 했습니다. 그 방법은 정말 효과가 있었고 사실 1962년에 이 컴퓨터는 코네티컷 주의 우승자를 무찔렀습니다.
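The talk does not describe how Samuel's checkers program worked internally, so the following is only a minimal sketch of the general self-play idea. It assumes a much simpler game than checkers: a take-away game in which players alternately remove 1 to 3 stones and whoever takes the last stone wins. The game, the function names `train_by_self_play` and `best_move`, and all parameter values are illustrative assumptions, not Samuel's method.

```python
import random
from collections import defaultdict


def train_by_self_play(start_stones=10, episodes=20000, epsilon=0.2, seed=0):
    """Learn move values for a take-away game purely from self-play.

    Illustrative rules (not checkers): players alternate removing 1-3
    stones; whoever takes the last stone wins.
    """
    rng = random.Random(seed)
    total = defaultdict(float)  # (stones, move) -> sum of game outcomes
    count = defaultdict(int)    # (stones, move) -> number of times tried

    def value(stones, move):
        # Average outcome seen so far for this move; 0.0 if never tried.
        n = count[(stones, move)]
        return total[(stones, move)] / n if n else 0.0

    for _ in range(episodes):
        stones = start_stones
        history = []  # one (stones, move) entry per turn, players alternating
        while stones > 0:
            moves = list(range(1, min(3, stones) + 1))
            if rng.random() < epsilon:
                move = rng.choice(moves)  # explore a random legal move
            else:
                move = max(moves, key=lambda m: value(stones, m))  # exploit
            history.append((stones, move))
            stones -= move
        # The player who made the last move won (+1); the opponent lost (-1).
        outcome = 1.0
        for state_move in reversed(history):
            total[state_move] += outcome
            count[state_move] += 1
            outcome = -outcome
    return value


def best_move(value, stones):
    """Pick the legal move with the highest learned average outcome."""
    moves = range(1, min(3, stones) + 1)
    return max(moves, key=lambda m: value(stones, m))


value = train_by_self_play()
print({n: best_move(value, n) for n in range(1, 11)})
```

Nobody tells the program which moves are good. It only records which moves tended to lead to wins across thousands of games against itself, which is the point the paragraph above makes.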

So Arthur Samuel was the father of machine learning, and I have a great debt to him, because I am a machine learning practitioner. I was the president of Kaggle, a community of over 200,000 machine learning practitioners. Kaggle puts up competitions to try and get them to solve previously unsolved problems, and it's been successful hundreds of times. So from this vantage point, I was able to find out a lot about what machine learning could do in the past, can do today, and what it could do in the future. Perhaps the first big success of machine learning commercially was Google. Google showed that it is possible to find information by using a computer algorithm, and this algorithm is based on machine learning. Since that time, there have been many commercial successes of machine learning. Companies like Amazon and Netflix use machine learning to suggest products that you might like to buy, movies that you might like to watch. Sometimes, it's almost creepy. Companies like LinkedIn and Facebook sometimes will tell you about who your friends might be and you have no idea how it did it, and this is because it's using the power of machine learning. These are algorithms that have learned how to do this from data rather than being programmed by hand.

그래서 아서 사무엘은 기계 학습의 아버지였고 저는 그분께 큰 빚을 지고 있죠. 왜냐하면 저는 기계 학습 기술자이니까요. 저는 캐글의 회장인데 캐글은 20만 명이 넘는 기계 학습 기술자들의 동호회입니다. 캐글은 이전까지 풀지 못했던 문제를 해결하기 위한 대회를 주최하는데 수백번 성공했습니다. 그래서 이런 유리한 시점에서 저는 기계 학습이 과거와 현재에 할 수 있는 일과 미래에 할 수 있는 일을 많이 알 수 있었습니다. 아마도 기계 학습이 상업에서 최초로 가장 크게 성공한 것은 구글이었습니다. 구글은 컴퓨터 알고리즘을 사용해서 정보를 찾을 수 있음을 보여줬는데 이 알고리즘은 기계 학습을 바탕으로 합니다. 그때부터 기계 학습의 상업적 성공이 많이 있었습니다. 아마존과 넷플릭스 같은 회사들은 기계 학습을 이용해서 여러분이 사고 싶은 상품이나 보고 싶은 영화를 제안합니다. 때로는 오싹할 지경이죠. 링크드인과 페이스북 같은 회사들은 누가 여러분의 친구인지를 말해줄 겁니다. 어떻게 그렇게 하는지 여러분은 모릅니다. 그 이유는 기계 학습의 힘을 이용하기 때문이죠. 이 알고리즘은 하는 방법을 손으로 쓴 프로그램 보다는 데이터에서 배웠습니다.

This is also how IBM was successful in getting Watson to beat the two world champions at "Jeopardy," answering incredibly subtle and complex questions like this one. ["The ancient 'Lion of Nimrud' went missing from this city's national museum in 2003 (along with a lot of other stuff)"] This is also why we are now able to see the first self-driving cars. If you want to be able to tell the difference between, say, a tree and a pedestrian, well, that's pretty important. We don't know how to write those programs by hand, but with machine learning, this is now possible. And in fact, this car has driven over a million miles without any accidents on regular roads.

IBM이 왓슨을 이용해 "제퍼디"에서 2명의 세계 챔피언을 성공적으로 무찌른 이유이기도 합니다. 이처럼 아주 미묘하고 복잡한 질문에 대답했죠. ["고대 '니무르드의 사자'가 2003년 이 도시의 국립 박물관에서 (다른 많은 유물과 함께) 사라졌습니다."] 이 때문에 우리는 이제 최초의 무인 자동차를 볼 수 있습니다. 나무와 보행자의 차이점, 그게 아주 중요한데 그걸 구별하고 싶을 때 손으로 프로그램을 어떻게 써야할지 모르지만 기계 학습으로 이제 가능합니다. 사실 이 자동차는 일반 도로에서 사고 없이 100만 마일 넘게 달렸습니다.

So we now know that computers can learn, and computers can learn to do things that we actually sometimes don't know how to do ourselves, or maybe can do them better than us. One of the most amazing examples I've seen of machine learning happened on a project that I ran at Kaggle where a team run by a guy called Geoffrey Hinton from the University of Toronto won a competition for automatic drug discovery. Now, what was extraordinary here is not just that they beat all of the algorithms developed by Merck or the international academic community, but nobody on the team had any background in chemistry or biology or life sciences, and they did it in two weeks. How did they do this? They used an extraordinary algorithm called deep learning. So important was this that in fact the success was covered in The New York Times in a front page article a few weeks later. This is Geoffrey Hinton here on the left-hand side. Deep learning is an algorithm inspired by how the human brain works, and as a result it's an algorithm which has no theoretical limitations on what it can do. The more data you give it and the more computation time you give it, the better it gets.

이제 우리는 컴퓨터가 배울 수 있고 우리가 실제로 하는 방법을 모르는 일도 할 수 있도록 배울 수 있음을 압니다. 어쩌면 우리보다 잘할 수도 있어요. 기계 학습에서 가장 놀라운 예가 제가 캐글에서 하는 프로젝트에서 일어났습니다. 토론토 대학 출신의 제프리 힌튼이 이끄는 팀은 자동 신약 개발을 위한 대회에서 이겼습니다. 자, 여기서 놀라운 사실은 그들이 머크 또는 국제 학회가 개발한 알고리즘을 이겼을 뿐만 아니라 어떤 팀원도 화학, 생물학, 생명과학에 관한 지식이 없었다는 점입니다. 그들은 2주안에 완성했죠. 어떻게 했을까요? 그들은 심화 학습이라는 놀라운 알고리즘을 사용했습니다. 이것은 사실 아주 중요해서 몇 주가 지난 뒤 뉴욕 타임즈에서 앞면 기사로 다뤘습니다. 왼쪽이 제프리 힌튼입니다. 심화 학습은 사람의 뇌가 작용하는 방식에 영감을 받아서 만든 알고리즘으로 그 결과 할 수 있는 일에 대한 이론적 한계가 없습니다. 더 많은 데이터와 더 많은 계산 시간을 줄수록 더 좋은 결과를 냅니다.

The New York Times also showed in this article another extraordinary result of deep learning which I'm going to show you now. It shows that computers can listen and understand.

뉴욕 타임즈는 이 기사에서 심화 학습의 또다른 놀라운 결과를 보여줬는데 여러분께 보여드리죠. 컴퓨터가 듣고 이해할 수 있음을 보여줍니다.

(Video) Richard Rashid: Now, the last step that I want to be able to take in this process is to actually speak to you in Chinese. Now the key thing there is, we've been able to take a large amount of information from many Chinese speakers and produce a text-to-speech system that takes Chinese text and converts it into Chinese language, and then we've taken an hour or so of my own voice and we've used that to modulate the standard text-to-speech system so that it would sound like me. Again, the result's not perfect. There are in fact quite a few errors. (In Chinese) (Applause) There's much work to be done in this area. (In Chinese) (Applause)

(영상) 리챠드 라시드: 제가 이 과정에서 마지막으로 보여드릴 단계는 실제 중국어로 말하는 것입니다. 중요한 점은 많은 중국인들로부터 엄청난 양의 정보를 모을 수 있었고 글자를 음성으로 바꾸는 시스템을 만들어 중국 글자를 중국 말로 변환시키고 제 목소리를 한 시간 정도 녹음해서 표준 문자 - 음성 변환 시스템을 조절해서 제 목소리처럼 나도록 만들었습니다. 역시 결과는 완벽하지 않습니다. 사실 오류가 상당히 있었습니다. (중국어) (박수) 아직 많은 작업이 필요합니다. (중국어) (박수)

Jeremy Howard: Well, that was at a machine learning conference in China. It's not often, actually, at academic conferences that you do hear spontaneous applause, although of course sometimes at TEDx conferences, feel free. Everything you saw there was happening with deep learning. (Applause) Thank you. The transcription in English was deep learning. The translation to Chinese and the text in the top right, deep learning, and the construction of the voice was deep learning as well.

제레미 하워드 : 중국에서 열린 기계 학습 회의였습니다. 학술 회의에서 실제로 즉흥적인 박수를 듣기는 쉽지 않죠. 그래도 TEDx 회의에서는 자유롭게 하세요. 거기서 본 모든 것이 심화 학습으로 일어났습니다. (박수) 감사합니다. 영어로 옮겨쓰기는 심화 학습이었죠. 중국어 번역과 오른쪽 위의 글자도 심화 학습이었고 목소리로 재생하는 것 역시 심화 학습이었습니다.

So deep learning is this extraordinary thing. It's a single algorithm that can seem to do almost anything, and I discovered that a year earlier, it had also learned to see. In this obscure competition from Germany called the German Traffic Sign Recognition Benchmark, deep learning had learned to recognize traffic signs like this one. Not only could it recognize the traffic signs better than any other algorithm, the leaderboard actually showed it was better than people, about twice as good as people. So by 2011, we had the first example of computers that can see better than people. Since that time, a lot has happened. In 2012, Google announced that they had a deep learning algorithm watch YouTube videos and crunched the data on 16,000 computers for a month, and the computer independently learned about concepts such as people and cats just by watching the videos. This is much like the way that humans learn. Humans don't learn by being told what they see, but by learning for themselves what these things are. Also in 2012, Geoffrey Hinton, who we saw earlier, won the very popular ImageNet competition, looking to try to figure out from one and a half million images what they're pictures of. As of 2014, we're now down to a six percent error rate in image recognition. This is better than people, again.

그래서 심화 학습은 놀라운 것입니다. 하나의 알고리즘인데 거의 모든 일을 할 수 있어 보입니다. 그보다 1년 전에 이미 보는 법도 배웠다는 사실을 저는 알게 되었습니다. 독일의 잘 알려지지 않은 대회인 독일 교통 표지판 인식 성능평가에서 심화 학습은 이런 교통 표지판을 인식하는 법을 배웠습니다. 교통 표지판을 어떤 알고리즘보다 잘 인식했을 뿐만 아니라 성적이 사람보다 2배 정도 나은 결과를 보였습니다. 2011년 우리는 사람보다 잘 볼 수 있는 컴퓨터의 첫번째 예를 가졌습니다. 그후로 많은 일이 일어났죠. 2012년 구글은 심화 학습 알고리즘에게 유튜브 동영상을 보게 하고 1만6천 대의 컴퓨터로 한 달 동안 데이터를 처리했다고 발표했습니다. 컴퓨터는 그냥 동영상을 보는 것만으로 사람과 고양이 같은 개념을 스스로 학습했습니다. 사람이 배우는 방법과 비슷하죠. 사람들은 보는 것을 알려줘서 배우는 게 아니라 그것이 뭔지 스스로 배웁니다. 또한 2012년 우리가 앞서 봤던 제프리 힌튼은 아주 유명한 이미지넷 대회에서 우승했는데 150만 장의 사진을 보고 그게 어떤 사진인지 맞추는 내용이죠. 2014년 이제 영상 인식에서 6%의 오차율까지 내려갔습니다. 이것도 사람보다 낫습니다.

So machines really are doing an extraordinarily good job of this, and it is now being used in industry. For example, Google announced last year that they had mapped every single location in France in two hours, and the way they did it was that they fed street view images into a deep learning algorithm to recognize and read street numbers. Imagine how long it would have taken before: dozens of people, many years. This is also happening in China. Baidu is kind of the Chinese Google, I guess, and what you see here in the top left is an example of a picture that I uploaded to Baidu's deep learning system, and underneath you can see that the system has understood what that picture is and found similar images. The similar images actually have similar backgrounds, similar directions of the faces, even some with their tongue out. This is clearly not looking at the text of a web page. All I uploaded was an image. So we now have computers which really understand what they see and can therefore search databases of hundreds of millions of images in real time.

기계는 정말 놀라울만큼 일을 잘하고 있고 이제 산업에서 사용됩니다. 예를 들어, 구글은 작년에 프랑스의 구석구석을 2시간 안에 지도로 만들었다고 발표했는데 그들이 한 방법은 길거리에서 찍은 사진을 심화 학습 알고리즘에 입력해서 주소를 인식하고 읽게 했습니다. 이전에는 얼마나 오래 걸렸을지 생각해보세요. 수십명의 사람들이 몇 년동안 했겠죠. 이것은 중국에서도 일어나고 있습니다. 바이두는 중국판 구글이라고 제가 추측하는데 왼쪽 위에서 보는 것은 바이두의 심화 학습 시스템에 제가 올린 사진의 예이고 그 아래에 그 사진이 뭔지를 시스템이 이해하고 비슷한 사진들을 찾아놓은 것을 볼 수 있죠. 비슷한 사진들은 실제로 비슷한 배경과 비슷한 얼굴 방향을 갖고 있고 혀를 내민 모습도 비슷하죠. 이것은 웹페이지의 글자를 찾은 게 아닙니다. 제가 올린 것은 사진이었죠. 이제 컴퓨터가 본 것을 정말 이해해서 수억 장의 사진이 든 데이터베이스를 실시간으로 찾을 수 있습니다.
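The talk does not say how Baidu's system is built. One common way to search by image rather than by surrounding text is to turn each image into a vector of numeric features and return the stored images whose vectors point in the most similar direction. The sketch below assumes those feature vectors already exist. The three-number vectors, the image names, and the helper `most_similar` are made up for illustration; a real system would use much longer vectors produced by a trained network.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two vectors by the angle between them (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, index, k=2):
    """Return the names of the k indexed images most similar to the query vector."""
    ranked = sorted(
        index,
        key=lambda name: cosine_similarity(query, index[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy feature vectors standing in for learned image features.
index = {
    "dog_tongue_out": [0.9, 0.8, 0.1],
    "dog_on_grass": [0.8, 0.2, 0.3],
    "red_car": [0.1, 0.0, 0.9],
}
query = [1.0, 0.7, 0.0]  # features of the uploaded picture (also made up)

print(most_similar(query, index))  # ['dog_tongue_out', 'dog_on_grass']
```

No text is involved at any point: the ranking depends only on the numbers derived from the pictures themselves.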

So what does it mean now that computers can see? Well, it's not just that computers can see. In fact, deep learning has done more than that. Complex, nuanced sentences like this one are now understandable with deep learning algorithms. As you can see here, this Stanford-based system showing the red dot at the top has figured out that this sentence is expressing negative sentiment. Deep learning now in fact is near human performance at understanding what sentences are about and what they are saying about those things. Also, deep learning has been used to read Chinese, again at about native Chinese speaker level. This algorithm was developed in Switzerland by people none of whom speak or understand any Chinese. As I say, deep learning is about the best system in the world for this, even compared to native human understanding.

컴퓨터가 볼 수 있다는 게 무슨 의미일까요? 컴퓨터가 볼 수 있다는 것만이 아니라 사실 심화 학습은 더 많은 일을 했습니다. 이렇게 복잡하고 미묘한 문장은 이제 심화 학습 알고리즘으로 이해할 수 있습니다. 여기서 보듯이 위에 있는 빨간점을 보여주는 스탠포드에 있는 시스템은 이 문장이 부정적인 느낌을 표현하는 것을 알아냈습니다. 심화 학습은 이제 사실 사람에 가깝게 문장을 이해하고 그게 어떤 말을 하는지 압니다. 심화 학습은 또한 중국어를 읽는데 사용되었고 중국어 원어민 수준입니다. 이 알고리즘은 스위스에서 개발되었는데 개발자 중 중국어를 할 수 있는 사람이 아무도 없었습니다. 심화 학습을 사용하는 것은 사람의 이해에 비해서도 세계 최고의 시스템에 관한 것입니다.

This is a system that we put together at my company which shows putting all this stuff together. These are pictures which have no text attached, and as I'm typing in here sentences, in real time it's understanding these pictures and figuring out what they're about and finding pictures that are similar to the text that I'm writing. So you can see, it's actually understanding my sentences and actually understanding these pictures. I know that you've seen something like this on Google, where you can type in things and it will show you pictures, but actually what it's doing is it's searching the webpage for the text. This is very different from actually understanding the images. This is something that computers have only been able to do for the first time in the last few months.

이것은 우리가 회사에서 모든 것을 다 통합해서 만든 시스템입니다. 이것들은 글자가 없는 사진들로서 제가 문장을 입력하면 실시간으로 그 사진들을 이해해서 그게 어떤 사진인지 알고 제가 쓰는 글에 대해 비슷한 사진을 찾아줍니다. 보다시피 제가 쓴 글을 이해하고 이 사진들을 실제로 이해합니다. 여러분은 구글에서 이와 비슷한 것을 봤을 텐데 여러분이 글자를 입력하면 사진을 보여줍니다. 하지만 실제로는 그 글자가 있는 웹페이지를 찾는 거죠. 이것은 사진을 실제로 이해하는 것과 아주 다릅니다. 이것은 컴퓨터가 지난 몇 달동안 처음으로 할 수 있었던 일입니다.

So we can see now that computers can not only see but they can also read, and, of course, we've shown that they can understand what they hear. Perhaps it's not surprising, then, that I'm going to tell you they can write. Here is some text that I generated using a deep learning algorithm yesterday. And here is some text that an algorithm out of Stanford generated. Each of these sentences was generated by a deep learning algorithm to describe each of those pictures. This algorithm has never before seen a man in a black shirt playing a guitar. It's seen a man before, it's seen black before, it's seen a guitar before, but it has independently generated this novel description of this picture. We're still not quite at human performance here, but we're close. In tests, humans prefer the computer-generated caption one out of four times. Now, this system is only two weeks old, so probably within the next year, the computer algorithm will be well past human performance at the rate things are going. So computers can also write.

이제 컴퓨터는 볼 수 있을 뿐만 아니라 읽을 수도 있고 물론 들은 것도 이해할 수 있음을 봤습니다. 컴퓨터가 쓸 줄 안다고 얘기해도 이제는 놀라지 않으실 거에요. 이것은 심화 학습 알고리즘을 사용해서 어제 제가 만든 글입니다. 이것은 스탠포드에서 만든 알고리즘으로 만든 글입니다. 이 글은 각각의 사진을 설명하기 위해 심화 학습 알고리즘이 만들었습니다. 이 알고리즘은 검은색 셔츠를 입고 기타를 치는 남자를 본 적이 없습니다. 남자를 본 적이 있고 검은 색을 본 적이 있고 기타를 본 적은 있어요. 그런데 스스로 이 사진을 훌륭하게 설명했습니다. 아직도 사람보다는 못하지만 꽤 가까이 왔습니다. 실험에서 사람들은 컴퓨터가 만들어낸 캡션을 4회당 1회 꼴로 좋아했습니다. 이 시스템은 이제 2주가 되었는데 아마도 내년 안으로 지금 진행되는 속도로 봐서 컴퓨터 알고리즘이 사람을 앞지를 것입니다. 컴퓨터는 쓸 수도 있습니다.

So we put all this together and it leads to very exciting opportunities. For example, in medicine, a team in Boston announced that they had discovered dozens of new clinically relevant features of tumors which help doctors make a prognosis of a cancer. Very similarly, at Stanford, a group there announced that, looking at tissues under magnification, they've developed a machine learning-based system which in fact is better than human pathologists at predicting survival rates for cancer sufferers. In both of these cases, not only were the predictions more accurate, but they generated new insightful science. In the radiology case, they were new clinical indicators that humans can understand. In this pathology case, the computer system actually discovered that the cells around the cancer are as important as the cancer cells themselves in making a diagnosis. This is the opposite of what pathologists had been taught for decades. In each of those two cases, they were systems developed by a combination of medical experts and machine learning experts, but as of last year, we're now beyond that too. This is an example of identifying cancerous areas of human tissue under a microscope. The system being shown here can identify those areas more accurately than, or about as accurately as, human pathologists, but was built entirely with deep learning using no medical expertise by people who have no background in the field. Similarly, here, this neuron segmentation. We can now segment neurons about as accurately as humans can, but this system was developed with deep learning by people with no previous background in medicine.

그래서 이 모든 기능을 합하면 아주 흥미로운 기회가 생기겠죠. 예를 들어 의학에서 보스턴의 팀은 종양에서 임상적으로 관련된 수십가지의 특징을 새롭게 발견했는데 이것으로 의사들이 암을 예측하는데 도움을 줄 수 있습니다. 스탠포드에서도 비슷하게 한 그룹이 조직을 확대경으로 보는 기계 학습을 기반으로 한 시스템을 개발했는데 사실 암 환자의 생존율을 예측하는데 병리학자보다 낫다고 합니다. 두 경우 모두 예측이 더 정확할 뿐만 아니라 통찰력있는 과학을 새로 만들어냈습니다. 방사선학의 경우 사람이 이해할 수 있는 새로운 임상 징후가 있었습니다. 병리학의 경우 컴퓨터 시스템은 진단을 하는데 실제로 암주변의 세포가 암 세포 만큼이나 중요하다는 사실을 발견했습니다. 이는 병리학자가 수십년동안 가르친 사실과 반대됩니다. 각각의 경우에서 시스템은 의학 전문과와 기계 학습 전문가가 함께 개발했지만 작년에 그걸 뛰어넘었습니다. 이것은 현미경으로 사람의 조직에서 암 조직을 밝히는 예입니다. 여기서 보는 시스템은 암 조직을 더 정확히 판별할 수 있고 병리학자만큼이나 정확하게 판별할 수 있지만 의학 전문가를 쓰지 않고 그 분야에 지식이 전혀 없는 사람들이 심화 학습 만으로 만들었습니다. 마찬가지로 여기 신경 분할인데 사람만큼이나 정확하게 신경을 분할할 수 있지만 이 시스템은 의학에 배경지식이 없는 사람들이 심화 학습을 이용해서 만들었습니다.

So myself, as somebody with no previous background in medicine, I seem to be entirely well qualified to start a new medical company, which I did. I was kind of terrified of doing it, but the theory seemed to suggest that it ought to be possible to do very useful medicine using just these data analytic techniques. And thankfully, the feedback has been fantastic, not just from the media but from the medical community, who have been very supportive. The theory is that we can take the middle part of the medical process and turn that into data analysis as much as possible, leaving doctors to do what they're best at. I want to give you an example. It now takes us about 15 minutes to generate a new medical diagnostic test and I'll show you that in real time now, but I've compressed it down to three minutes by cutting some pieces out. Rather than showing you creating a medical diagnostic test, I'm going to show you a diagnostic test of car images, because that's something we can all understand.

그래서 저처럼 의학에 배경지식이 없는 사람이 새로운 의료 회사를 시작하는데 아주 적합한 사람처럼 보여서 실제로 그렇게 했죠. 공포를 느꼈지만 이론은 이런 데이터 분석기법을 이용해서 아주 유용한 의학이 가능함을 제시해주고 있었죠. 그리고 감사하게도 평가는 좋았습니다. 미디어 뿐만 아니라 의학계에서도 아주 긍정적이었습니다. 그 이론은 의료 과정의 중간 부분을 우리가 가져와서 최대한 데이터 분석을 한 뒤 의사들에게 그들이 잘하는 일을 맡기는 거죠. 예를 보여드리겠습니다. 새로운 의료 진단 실험을 하는데 15분쯤 걸리는데 이제 실시간으로 보여드리죠. 몇 단계를 생략해서 3분으로 줄였습니다. 의료 진단 실험을 하는 것을 보여주는 대신 자동차 사진의 진단 실험을 보여드리겠습니다. 왜냐하면 우리 모두 이해할 수 있는 거니까요.

So here we're starting with about 1.5 million car images, and I want to create something that can split them into the angle of the photo that's being taken. So these images are entirely unlabeled, so I have to start from scratch. With our deep learning algorithm, it can automatically identify areas of structure in these images. So the nice thing is that the human and the computer can now work together. So the human, as you can see here, is telling the computer about areas of interest which it wants the computer then to try and use to improve its algorithm. Now, these deep learning systems actually are in 16,000-dimensional space, so you can see here the computer rotating this through that space, trying to find new areas of structure. And when it does so successfully, the human who is driving it can then point out the areas that are interesting. So here, the computer has successfully found areas, for example, angles. So as we go through this process, we're gradually telling the computer more and more about the kinds of structures we're looking for. You can imagine in a diagnostic test this would be a pathologist identifying areas of pathosis, for example, or a radiologist indicating potentially troublesome nodules. And sometimes it can be difficult for the algorithm. In this case, it got kind of confused. The fronts and the backs of the cars are all mixed up. So here we have to be a bit more careful, manually selecting these fronts as opposed to the backs, then telling the computer that this is a type of group that we're interested in.

여기서 150만 개의 자동차 사진으로 시작하죠. 사진을 찍은 각도로 분류하는 뭔가를 만들고 싶어요. 이 사진들은 모두 제목도 없어서 처음부터 시작해야 됩니다. 심화 학습 알고리즘으로 이 사진들의 구조를 자동으로 구별할 수 있습니다. 좋은 점은 사람과 컴퓨터가 함께 일할 수 있다는 거죠. 사람은 여기서 보다시피 컴퓨터한테 관심분야를 말하고 컴퓨터가 알고리즘을 개선하죠. 자, 이 심화 학습 시스템은 실제로 1만6천 차원의 공간을 가집니다. 컴퓨터가 이것을 그 공간사이로 회전하는 것을 볼 수 있습니다. 새로운 구조를 발견하려는 거죠. 컴퓨터가 성공적으로 끝내면 그걸 작동하는 사람은 관심있는 분야를 가리킵니다. 여기서 컴퓨터는 그 분야를 성공적으로 찾아냈는데 이 경우는 각도이죠. 우리가 이 과정을 거쳐가면서 컴퓨터한테 우리가 찾고 있는 구조에 대해서 단계적으로 더 많이 말해줍니다. 진단 실험에서 병리학자가 병적 상태인 곳을 밝혀내거나 방사선의가 문제가 있을 수 있는 혹을 가르키는 것을 상상할 수 있습니다. 알고리즘에서 어려운 부분도 있습니다. 이 경우 약간 헷갈렸어요. 자동차의 앞과 뒤가 모두 섞여버렸죠. 그래서 여기서 좀더 주의해서 뒤가 아니라 앞을 수동으로 선택해서 컴퓨터에게 우리가 관심있는 부분이 이 부분이라고 얘기를 해야합니다.

So we do that for a while, we skip over a little bit, and then we train the machine learning algorithm based on these couple of hundred things, and we hope that it's gotten a lot better. You can see, it's now started to fade some of these pictures out, showing us that it already is recognizing how to understand some of these itself. We can then use this concept of similar images, and using similar images, you can now see, the computer at this point is able to entirely find just the fronts of cars. So at this point, the human can tell the computer, okay, yes, you've done a good job of that.

그래서 한동안 그 작업을 하고 좀 더 건너뛰면 이런 수백 가지 일을 바탕으로 기계 학습 알고리즘을 훈련시켜 앞으로 더 나아지기를 바랍니다. 보다시피 시스템은 사진들 일부를 사라지게 만들면서 이 사진들을 이해하는 법을 이미 인식하고 있음을 보여줍니다. 우리는 비슷한 사진의 개념을 이용해서 이제 여러분이 보는 것과 같이 이 시점에서 컴퓨터는 자동차의 앞만 찾을 수 있습니다. 이 시점에서 사람은 컴퓨터에게 좋아, 잘 했어. 라고 말할 수 있죠.

Sometimes, of course, even at this point it's still difficult to separate out groups. In this case, even after we let the computer try to rotate this for a while, we still find that the left-side and right-side pictures are all mixed up together. So we can again give the computer some hints, and we say, okay, try and find a projection that separates out the left sides and the right sides as much as possible using this deep learning algorithm. And giving it that hint -- ah, okay, it's been successful. It's managed to find a way of thinking about these objects that has separated them out.

물론 어떤 경우는 이 시점에도 그룹으로 나누기가 어렵습니다. 이 경우 컴퓨터가 한동안 이것을 회전하게 내버려둬도 왼쪽과 오른쪽이 뒤섞인 것을 볼 수 있습니다. 그래서 컴퓨터한테 다시 힌트를 줘서 심화 학습 알고리즘을 이용해서 왼쪽과 오른쪽을 가능한 분리시키는 투사도를 찾아라고 합니다. 그 힌트를 주면 성공입니다. 이들 물체들을 분리해내는 방법을 스스로 찾은 거죠.
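The talk does not say how the demonstrated system finds its projection. One classical technique for "find a projection that separates two hinted groups as much as possible" is Fisher's linear discriminant, used here as a stand-in. The sketch assumes 2-dimensional toy points instead of the 16,000-dimensional space mentioned in the talk, and the point values and function names are hypothetical.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix around mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_direction(group_a, group_b):
    """Direction w = Sw^-1 (mean_b - mean_a) from Fisher's linear discriminant."""
    ma, mb = mean(group_a), mean(group_b)
    axx, axy, ayy = scatter(group_a, ma)
    bxx, bxy, byy = scatter(group_b, mb)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy  # within-class scatter Sw
    det = sxx * syy - sxy * sxy
    dx, dy = mb[0] - ma[0], mb[1] - ma[1]
    # Apply the inverse of [[sxx, sxy], [sxy, syy]] to (dx, dy).
    return [(syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det]


def project(points, w):
    """Collapse each 2-D point to a single number along direction w."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Hypothetical hinted examples: the groups overlap on both the x and y axes.
left_sides = [(0.0, 0.0), (1.0, 1.2), (2.0, 1.8), (3.0, 3.0)]
right_sides = [(0.0, 1.0), (1.0, 2.2), (2.0, 2.8), (3.0, 4.0)]

w = fisher_direction(left_sides, right_sides)
print(project(left_sides, w))
print(project(right_sides, w))
```

Looking at either raw coordinate alone, the two groups are mixed together. Along the direction `w`, every projected left-side value is below every projected right-side value, which is the kind of separation the hint asks for.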

So you get the idea here. This is a case not where the human is being replaced by a computer, but where they're working together. What we're doing here is we're replacing something that used to take a team of five or six people about seven years and replacing it with something that takes 15 minutes for one person acting alone.

여기서 생각을 얻을 수 있죠. 사람이 컴퓨터로 대체되는 경우가 아니라 함께 일합니다. 우리가 여기서 하는 일은 5-6명의 팀이 7년쯤 걸리는 일을 한 사람이 15분 걸려서 하는 일로 대체합니다.

So this process takes about four or five iterations. You can see we now have 62 percent of our 1.5 million images classified correctly. And at this point, we can start to quite quickly grab whole big sections, check through them to make sure that there's no mistakes. Where there are mistakes, we can let the computer know about them. And using this kind of process for each of the different groups, we are now up to an 80 percent success rate in classifying the 1.5 million images. And at this point, it's just a case of finding the small number that aren't classified correctly, and trying to understand why. And using that approach, by 15 minutes we get to 97 percent classification rates.

이 과정은 4 - 5 번 반복합니다. 보다시피 150만 장의 사진의 62%가 제대로 분류된 것을 볼 수 있죠. 이 시점에서 우리는 전체를 빠르게 잡아서 실수가 없는지 확인합니다. 실수가 있으면 컴퓨터에게 알리죠. 각각의 다른 그룹에서 이런 과정을 통해 150만 장의 사진을 분류하는데 80%의 성공율을 보입니다. 이 시점에서는 바르게 분류되지 않은 적은 숫자를 찾아 이유를 알아내는 과정입니다. 그런 방식으로 15분 안에 우리는 97%의 분류율을 얻습니다.
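To make the percentages above concrete, the short calculation below shows how many images are still not correctly classified at each stage. It assumes the total is exactly 1,500,000, whereas the talk says "about 1.5 million", so the counts are approximate. The function name `remaining_after` is just for illustration.

```python
TOTAL_IMAGES = 1_500_000  # assumed exact; the talk says "about 1.5 million"


def remaining_after(percent_correct, total=TOTAL_IMAGES):
    """Number of images not yet correctly classified at a given success rate."""
    correct = total * percent_correct // 100
    return total - correct


for percent in (62, 80, 97):
    print(f"{percent}% correct -> {remaining_after(percent):,} images left to fix")
# 62% correct -> 570,000 images left to fix
# 80% correct -> 300,000 images left to fix
# 97% correct -> 45,000 images left to fix
```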

So this kind of technique could allow us to fix a major problem, which is that there's a lack of medical expertise in the world. The World Economic Forum says that there's between a 10x and a 20x shortage of physicians in the developing world, and it would take about 300 years to train enough people to fix that problem. So imagine if we can help enhance their efficiency using these deep learning approaches?

이런 기술은 우리가 중요한 문제를 고칠 수 있게 하는데 그것은 세계에서 의료 전문가가 부족하다는 사실입니다. 세계 경제 포럼은 개발도상국에서 10배에서 20배의 의사가 부족하다고 말했는데 그 문제를 고치기 위해 충분한 인원을 교육시키려면 300년이 걸립니다. 이런 심화 학습 방식을 사용해서 그들의 효율을 높일 수 있다고 상상해보세요.

So I'm very excited about the opportunities. I'm also concerned about the problems. The problem here is that every area in blue on this map is somewhere where services are over 80 percent of employment. What are services? These are services. These are also the exact things that computers have just learned how to do. So 80 percent of the world's employment in the developed world is stuff that computers have just learned how to do. What does that mean? Well, it'll be fine. They'll be replaced by other jobs. For example, there will be more jobs for data scientists. Well, not really. It doesn't take data scientists very long to build these things. For example, these four algorithms were all built by the same guy. So if you think, oh, it's all happened before, we've seen the results in the past of when new things come along and they get replaced by new jobs, what are these new jobs going to be? It's very hard for us to estimate this, because human performance grows at this gradual rate, but we now have a system, deep learning, that we know actually grows in capability exponentially. And we're here. So currently, we see the things around us and we say, "Oh, computers are still pretty dumb." Right? But in five years' time, computers will be off this chart. So we need to be starting to think about this capability right now.

저는 그런 기회에 대해 아주 흥분했습니다. 저는 그 문제도 걱정합니다. 여기서 문제는 이 지도에서 파란색으로 표시된 곳은 서비스가 고용의 80% 이상을 차지합니다. 무슨 서비스일까요? 이런 서비스입니다. 이것들은 컴퓨터가 방금 배운 것과 똑같습니다. 개발된 세상에서 고용의 80%가 컴퓨터가 방금 배운 것입니다. 그게 뭘 뜻합니까? 글쎄, 괜찮을거에요. 다른 일자리로 대체되겠죠. 예를 들면, 데이터 과학자한테 더 많은 일이 있을 겁니다. 그렇지 않아요. 데이터 과학자가 이런 것을 만드는데 오래 걸리지 않습니다. 예를 들어, 4가지 알고리즘이 모두 한 사람이 만들었죠. 여러분이 이전에도 이런 일이 벌어졌다고 생각한다면 과거에 새로운 것이 나타났을 때 그 결과를 본 적이 있죠. 새로운 일자리로 대체되었고 새로운 일자리는 어떤 것일까요? 이것을 예측하기가 정말 어렵습니다. 왜냐하면 사람의 성과는 이렇게 점진적인데 심화 학습 시스템은 능력이 기하급수적으로 증가하는 것을 압니다. 우리는 여기에 있죠. 현재 우리는 주변을 보면서 말해요. "컴퓨터는 정말 바보야." 그렇지? 하지만 5년 안에 컴퓨터는 이 도표밖으로 나갈 겁니다. 그래서 이 능력을 지금 당장 생각해야 합니다.

We have seen this once before, of course. In the Industrial Revolution, we saw a step change in capability thanks to engines. The thing is, though, that after a while, things flattened out. There was social disruption, but once engines were used to generate power in all the situations, things really settled down. The Machine Learning Revolution is going to be very different from the Industrial Revolution, because the Machine Learning Revolution, it never settles down. The better computers get at intellectual activities, the more they can build better computers to be better at intellectual capabilities, so this is going to be a kind of change that the world has actually never experienced before, so your previous understanding of what's possible is different.

물론 전에도 이걸 본 적이 있습니다. 산업 혁명에서 엔진 덕분에 능력이 한 단계 달라졌죠. 하지만 시간이 좀 흐른 뒤 오름세가 멈췄습니다. 사회적 분열이 있었지만 엔진을 사용해서 모든 상황에서 동력을 만들어내자 모든게 안정되었죠. 기계 학습 혁명은 산업 혁명과는 아주 다릅니다. 기계 학습 혁명은 절대 안정되지 않을 거니까요. 컴퓨터의 지능활동이 더 나을수록 더 나은 컴퓨터를 만들테고 그 컴퓨터는 지적 능력이 더 뛰어나겠죠. 그래서 이것은 세계가 실제로 경험해본 적이 없는 변화가 될 것입니다. 여러분이 이전에 가능하다고 이해한 것들이 이제는 다릅니다.

This is already impacting us. In the last 25 years, as capital productivity has increased, labor productivity has been flat, in fact even a little bit down.

이것은 이미 우리에게 영향을 주고 있습니다. 지난 25년간 자본 생산량은 증가했지만 노동 생산량은 변화가 없었고 사실 조금 감소했습니다.

So I want us to start having this discussion now. I know that when I often tell people about this situation, people can be quite dismissive. Well, computers can't really think, they don't emote, they don't understand poetry, we don't really understand how they work. So what? Computers right now can do the things that humans spend most of their time being paid to do, so now's the time to start thinking about how we're going to adjust our social structures and economic structures to be aware of this new reality. Thank you. (Applause)

그래서 이런 토론을 지금부터 시작하고 싶습니다. 제가 이런 상황을 사람들에게 종종 얘기하면 사람들은 아주 무시합니다. 컴퓨터는 진짜 생각할 수 없어. 감정을 드러내지 못하고 시도 이해를 못하지. 우리는 컴퓨터가 어떻게 작동하는지 정말 이해할 수 없어. 그러니 어쩌라고? 컴퓨터는 지금 사람들이 돈받고 하는 일을 할 수 있습니다. 그래서 이제는 우리가 이런 새로운 현실을 인식하도록 사회적, 경제적 구조를 조정하는 법을 생각해봐야 할 때입니다. 감사합니다. (박수)
