Noriko Arai: Can a robot pass a university entrance exam?

Today, I'm going to talk about AI and us. AI researchers have always said that we humans do not need to worry, because only menial jobs will be taken over by machines. Is that really true? They have also said that AI will create new jobs, so those who lose their jobs will find a new one. Of course. But the real question is: How many of those who may lose their jobs to AI will be able to land a new one, especially when AI is smart enough to learn better than most of us?

今天我要跟大家聊聊「人工智慧與你我」，人工智慧研究專家常說，人類不需要擔心，因為機器只會取代那些乏味枯燥的粗活。是真的嗎？他們也說過人工智慧可以創造新的工作，因此那些失業的人還是可以再就業。當然。只是問題的癥結是：多少人因為人工智慧失業後能夠真的找到新的工作，尤其當人工智慧已經成熟到比大多數人都能更有效率地學習？

Let me ask you a question: How many of you think that AI will pass the entrance examination of a top university by 2020? Oh, so many. OK. So some of you may say, "Of course, yes!" Now singularity is the issue. And some others may say, "Maybe, because AI already won against a top Go player." And others may say, "No, never. Uh-uh." That means we do not know the answer yet, right? So that was the reason why I started the Todai Robot Project, making an AI which passes the entrance examination of the University of Tokyo, the top university in Japan.

讓我問你們一個問題：你們之中認為在 2020 年之前人工智慧將可以通過頂尖大學的入學測驗的人，請舉手。哇，好多。好的。相當多人會認為：「那當然！」現在「人工智慧的奇點」是個熱門話題。也有很多人會說：「有可能，人工智慧都已經打敗過世界頂尖的圍棋高手了。」也有一些人持不同看法：「絕對不可能。嗯，就是這樣。」這說明了我們其實還沒有真正的答案，不是嗎？這就是我啟動「東大機器人計畫」的始末，打造一個人工智慧機器人，通過東京大學的入學考，東大是全日本最頂尖的學府。

This is our Todai Robot. And, of course, the brain of the robot is working in the remote server. It is now writing a 600-word essay on maritime trade in the 17th century. How does that sound?

為您介紹東大機器人。當然，它正接受遠端伺服器遙控著。正在寫一篇 600 字的論文，闡述 17 世紀的海上貿易。聽起來如何？

Why did I take the entrance exam as its benchmark? Because I thought we had to study the performance of AI in comparison to humans, especially on the skills and expertise which are believed to be acquired only by humans and only through education. To enter Todai, the University of Tokyo, you have to pass two different types of exams. The first one is a national standardized test in multiple-choice style. You have to take seven subjects and achieve a high score -- I would say like an 85 percent or more accuracy rate -- to be allowed to take the second stage written test prepared by Todai.

為什麼當初要將入學考試作為一個標竿？因為我想，我們有必要研究人工智慧相較人類的表現，尤其那些有規模跟特殊性的專長領域，向來我們都相信這些技能和知識惟有人類透過教育才能獲得。要考上日本的第一學府，東京大學，必須通過兩項測驗。第一個是日本高考，全國標準化的測驗，全選擇題的題型。你必須在七大科目中都獲得高分—— 大概要達到 85% 以上的正確率—— 才可以獲准進入第二階段筆試，由東大命題。

So let me first explain how modern AI works, taking the "Jeopardy!" challenge as an example. Here is a typical "Jeopardy!" question: "Mozart's last symphony shares its name with this planet." Interestingly, a "Jeopardy!" question always asks, always ends with "this" something: "this" planet, "this" country, "this" rock musician, and so on. In other words, "Jeopardy!" doesn't ask many different types of questions, but a single type, which we call "factoid questions."

先讓我說明一下當代的人工智慧運作的方式，舉一個「Jeopardy!」《危險邊緣》的例子。典型的一道題目如下：「莫札特最後創作的交響曲跟這個星球同名。」有趣的是，《危險邊緣》的例子總是有「這個」字眼在裡頭：「這個」星球、「這個」國家，「這個」搖滾樂手或「這個」什麼什麼。也就是說，《危險邊緣》沒有太多類型的題目，幾乎只有這一類，我們稱作「趣味小問題」。

By the way, do you know the answer? If you do not know the answer and if you want to know the answer, what would you do? You Google, right? Of course. Why not? But you have to pick appropriate keywords like "Mozart," "last" and "symphony" to search. The machine basically does the same. Then this Wikipedia page will be ranked top. Then the machine reads the page. No, uh-uh.

有沒有人剛好知道答案呀？假如我們不知道答案，但是又想要知道，怎麼辦？當然是上網搜尋呀—— 為甚麼不呢? 但要挑對關鍵字，譬如輸入像「莫札特」、「最後」跟「交響曲」去搜尋。基本上機器人也是這麼做。維基百科的頁面就會出現在最上頭。所以機器人就開始「讀」這個頁面嗎？並不是喔──

Unfortunately, none of the modern AIs, including Watson, Siri and Todai Robot, is able to read. But they are very good at searching and optimizing. It will recognize that the keywords "Mozart," "last" and "symphony" are appearing heavily around here. So if it can find a word which is a planet and which is co-occurring with these keywords, that must be the answer. This is how Watson finds the answer "Jupiter," in this case.

不幸的是，所有當代的智慧機器人，無論是 IBM 的華生、蘋果的 Siri 或是東大機器人，它們都沒有「閱讀」的能力。但是它們在搜尋跟得到最佳化結果上很在行。它會找到關鍵字像是「莫札特」、「最後」跟「交響曲」，重複地出現在這一帶。接著繼續尋找屬於星球的詞彙，是跟前述這些關鍵字同時出現的，那鐵定就是答案了。華生就是這樣找到「木星」的。
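The keyword co-occurrence idea described above can be illustrated with a small Python sketch. This is only a toy illustration under stated assumptions, not Watson's actual pipeline: the passage is a made-up stand-in for the Wikipedia page, and the function name `rank_candidates`, the `window` size, and the `PLANETS` list are all hypothetical choices made for the example.

```python
import re
from collections import Counter

# Hypothetical candidate list: words of the type the question asks for ("this planet").
PLANETS = {"mercury", "venus", "earth", "mars", "jupiter", "saturn", "uranus", "neptune"}

def rank_candidates(text, keywords, candidates, window=10):
    """Score each candidate word by how many keywords occur within
    `window` tokens of it. No parsing or understanding is involved."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    keywords = {k.lower() for k in keywords}
    scores = Counter()
    for i, tok in enumerate(tokens):
        if tok in candidates:
            nearby = tokens[max(0, i - window): i + window + 1]
            scores[tok] += sum(1 for t in nearby if t in keywords)
    return scores.most_common()

# Toy passage standing in for the top-ranked search result (an assumption).
passage = (
    "Mozart wrote his last symphony, No. 41, in 1788. "
    "It is nicknamed the Jupiter Symphony. "
    "The suite The Planets by Holst has movements named Mars and Venus."
)

ranking = rank_candidates(passage, ["Mozart", "last", "symphony"], PLANETS)
print(ranking)
```

On this toy passage, "jupiter" is ranked first simply because it sits closest to the keywords; the program never interprets what the sentences say.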

Our Todai Robot works similarly, but a bit smarter in answering history yes-no questions, like, "'Charlemagne repelled the Magyars.' Is this sentence true or false?" Our robot starts producing a factoid question, like: "Charlemagne repelled [this person type]" by itself. Then, "Avars" but not "Magyars" is ranked top. This sentence is likely to be false. Our robot does not read, does not understand, but it is statistically correct in many cases.

東大機器人的運作方式很接近，但在回答歷史科目的判斷題上表現稍好，例如：查理曼大帝擊敗馬札爾人，對還是錯？機器人自動轉換為一道趣味小問題，變成：「查理曼大帝擊敗了這一種人」。結果最上頭出現了「阿瓦爾人」而非「馬札爾人」。所以這個陳述句很可能是錯誤的。機器人不會閱讀，也不了解，但從統計學角度評估，卻具有高準確度。
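As a rough sketch of the true/false trick described above, the statement can be turned into a slot question ("Charlemagne repelled [this people]"), the candidates that fill the slot can be counted in a corpus, and the top-ranked filler can be compared with the one claimed in the statement. The corpus sentences, the candidate list `PEOPLES`, and the function names are illustrative assumptions, not the Todai Robot's actual code or data.

```python
import re
from collections import Counter

# Toy corpus standing in for textbook and Wikipedia sentences (an assumption).
CORPUS = [
    "Charlemagne repelled the Avars on the eastern frontier.",
    "The Avars were repelled by Charlemagne and his son Pippin.",
    "Otto I repelled the Magyars at the Battle of Lechfeld.",
    "The Magyars raided lands once ruled by Charlemagne.",
]

# Hypothetical list of words that can fill the "[this people]" slot.
PEOPLES = {"avars", "magyars", "saxons", "lombards"}

def top_filler(sentences, subject, verb, candidates):
    """Return the candidate that most often co-occurs with subject and verb."""
    counts = Counter()
    for sentence in sentences:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if subject in words and verb in words:
            for candidate in candidates:
                if candidate in words:
                    counts[candidate] += 1
    return counts.most_common(1)[0][0] if counts else None

def judge(sentences, subject, verb, claimed, candidates):
    """Guess True if the claimed filler is also the top-ranked one.
    Returns None when the corpus offers no evidence either way."""
    best = top_filler(sentences, subject, verb, candidates)
    if best is None:
        return None
    return best == claimed

print(judge(CORPUS, "charlemagne", "repelled", "magyars", PEOPLES))
print(judge(CORPUS, "charlemagne", "repelled", "avars", PEOPLES))
```

The verdict is purely statistical: the sketch marks the Magyars statement as false only because "Avars" co-occurs more often with "Charlemagne" and "repelled" in the toy corpus.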

For the second stage written test, it is required to write a 600-word essay like this one:

至於第二階段的筆試，受測者必須寫一篇 600 字的論文，如這一道題：

[Discuss the rise and fall of the maritime trade in East and Southeast Asia in the 17th century ...]

（闡述 17 世紀時東亞與東南亞海上貿易的興衰……）如同我稍早展示過的，

and as I have shown earlier, our robot took the sentences from the textbooks and Wikipedia, combined them together, and optimized it to produce an essay without understanding a thing.

我們的機器人將教科書與維基百科的句子併在一起，優化後形成一篇文章，完全不懂字裡行間的意涵。

(Laughter)

（笑聲）

But surprisingly, it wrote a better essay than most of the students.

但是令人驚訝的是，機器人這樣寫出來的文章，居然比大多數的學生好。

(Laughter)

（笑聲）

How about mathematics? A fully automatic math-solving machine has been a dream since the birth of the word "artificial intelligence," but it has stayed at the level of arithmetic for a long, long time. Last year, we finally succeeded in developing a system which solved pre-university-level problems from end to end, like this one. This is the original problem written in Japanese, and we had to teach it 2,000 mathematical axioms and 8,000 Japanese words to make it accept the problems written in natural language. And it is now translating the original problems into machine-readable formulas. Weird, but it is now ready to solve it, I think. Go and solve it. Yes! It is now executing symbolic computation. Even more weird, but probably this is the most fun part for the machine.

那麼數學呢？能全自動處理數學問題的機器人，是大家都夢寐以求的，打從「人工智慧」的概念問世以來就是如此。但是，它曾經長期停滯在算術的階段。去年我們總算成功發展一套系統，可以從頭到尾地解決中等教育程度的數學題目，像這一題。原文是日文。我們必須先教會機器人 2,000 個數學公理，與 8,000 個日文字，才能讓機器人看懂原文的數學題目。它現在正在翻譯原來的題目成為機器人的語言。很怪，不過應該可以開始計算了。開始解題。沒錯！它正在進行符號運算。更怪了，但或許機器人會覺得這個才好玩。

(Laughter)

（笑聲）

Now it outputs a perfect answer, though its proof is impossible to read, even for mathematicians. Anyway, last year our robot was among the top one percent in the second stage written exam in mathematics.

好了，它產出了一個完美的解。儘管連數學家都證實了，完全沒有人看得懂。無論如何，去年我們的機器人在第二階段的數學表現中被歸類在排名前 1% 高分的群組中。

(Applause)

（掌聲）

Thank you.

謝謝。

So, did it enter Todai? No, not as I expected. Why? Because it doesn't understand any meaning. Let me show you a typical error it made in the English test.

所以它最終有沒有考上東大呢？它並沒有如預期的金榜題名。為什麼？因為它根本什麼也不懂。讓我展示一個在英文科的典型錯誤。

[Nate: We're almost at the bookstore. Just a few more minutes. Sunil: Wait. ______ . Nate: Thank you! That always happens ...]

（奈特：我們快到書店了，再過幾分鐘就到了。桑妮：等一下。______。奈特：謝謝！每次都這樣……）兩個人在對話。

Two people are talking. For us, who can understand the situation --

我們都明白發生了什麽—— （選項：1. 我們走了很久的路 2. 我們幾乎快到了

[1. "We walked for a long time." 2. "We're almost there." 3. "Your shoes look expensive." 4. "Your shoelace is untied."]

3. 你的鞋子看起來好昂貴 4. 你的鞋帶鬆了）很明顯地，標準答案是選 4，同意嗎？

it is obvious number four is the correct answer, right? But Todai Robot chose number two, even after learning 15 billion English sentences using deep learning technologies. OK, so now you might understand what I said: modern AIs do not read, do not understand. They only act as if they do.

可是東大機器人選 2，就算已經學習了 150 億個英文詞句，還透過深度學習技術。好吧，現在你可能明白我所說的：當代人工智慧沒有辦法閱讀，不能理解。它們只是佯裝成什麽都懂。

This is the distribution graph of half a million students who took the same exam as Todai Robot. Now our Todai Robot is among the top 20 percent, and it was capable of passing the entrance exams of more than 60 percent of the universities in Japan -- but not Todai. But see how it is beyond the volume zone of to-be white-collar workers.

這個分佈圖，代表跟東大機器人一起接受入學考試的其他 50 萬名考生的成績。機器人排名其中的前 20%，可以考進日本超過六成的大學── 但就是考不上東大。可是看看被它超越的廣大區塊，所謂的白領階級。

You might think I was delighted. After all, my robot was surpassing students everywhere. Instead, I was alarmed. How on earth could this unintelligent machine outperform students -- our children? Right? I decided to investigate what was going on in the human world. I took hundreds of sentences from high school textbooks and made easy multiple-choice quizzes, and asked thousands of high school students to answer.

你可能會猜想我應該很開心。畢竟我的機器人正在全面性地超越學生。其實不然，我很驚恐。這個一點都不聰明的機器人居然表現得比學生們── 也是我們的孩子們，更好？怎麼可以？我決定深入調查人類世界究竟發生了什麼事。我收集了上百個高中教科書裡頭的詞句，然後編成簡單的選擇題測驗，讓上千位高中生接受測試。

Here is an example:

這是其中一個範例：

[Buddhism spread to ... , Christianity to ... and Oceania, and Islam to ...]

題目都是用他們的母語 ──日文寫的。（題目：______傳播到了大洋洲。

Of course, the original problems are written in Japanese, their mother tongue.

1. 印度教 2. 基督教 3. 伊斯蘭教 4. 佛教）

[ ______ has spread to Oceania. 1. Hinduism 2. Christianity 3. Islam 4. Buddhism ]

顯而易見的答案是基督教，對吧？

Obviously, Christianity is the answer, isn't it? It's written! And Todai Robot chose the correct answer, too. But one-third of junior high school students failed to answer this question. Do you think it is only the case in Japan? I do not think so, because Japan is always ranked among the top in OECD PISA tests, measuring 15-year-old students' performance in mathematics, science and reading every three years.

都包含在題目所給信息裡了。東大機器人也選出了正確的答案。但是有三分之一的國中生無法回答這個問題。你以為這個問題只存在於日本嗎？我不這麼認為，日本總是在國際學生能力評估計劃測驗中名列前茅。那是一套衡量 15 歲青少年在數學、科學與閱讀素質的測驗，每三年考一次。

We have long believed that everybody can learn and learn well, as long as we provide good learning materials free on the web so that they can access them through the internet. But such wonderful materials may benefit only those who can read well, and the percentage of those who can read well may be much less than we expected. How we humans will coexist with AI is something we have to think about carefully, based on solid evidence. At the same time, we have to think in a hurry because time is running out.

我們一直相信，每個人都能學習，並且學得出色，只要我們提供高質量的學習資料，這些免費資源，讓他們透過網路取得使用。但這些優質的資料只會讓那些能夠有效閱讀的人受益，而能夠有效閱讀的人所佔的比例，可能遠低於我們的預期。人類要如何與人工智慧共存是我們要謹慎思考的課題，客觀考量各項可靠證據。同時，我們也要加緊腳步思考，因為所剩時間不多了。

Thank you.

謝謝。

(Applause)

（掌聲）

Chris Anderson: Noriko, thank you.

克里斯 · 安德森：紀子，謝謝你。

Noriko Arai: Thank you.

新井紀子：謝謝。

CA: In your talk, you so beautifully give us a sense of how AIs think, what they can do amazingly and what they can't do. But -- do I read you right, that you think we really need quite an urgent revolution in education to help kids do the things that humans can do better than AIs?

克里斯：您方才向我們說明了人工智慧的運作方式。它們可以完美勝任的，以及無法勝任的工作。但是，我的解讀是否是正確的，你認為我們急需教育改革，以協助學子在特定的領域中讓人工智慧難以望其項背？

NA: Yes, yes, yes. Because we humans can understand the meaning. That is something which is very, very lacking in AI. But most of the students just pack the knowledge without understanding the meaning of the knowledge, so that is not knowledge, that is just memorizing, and AI can do the same thing. So we have to think about a new type of education.

紀子：是的。因為我們人類可以理解意義。而這在人工智慧中是相當缺乏的。但是大多數的學生都只會囫圇吞棗，而非深入理解知識，這就只是單純記憶的動作，人工智慧也辦得到。所以我們應該要思考新型態的教育。

CA: A shift from knowledge, rote knowledge, to meaning.

克里斯：從「死記硬背」到「深入理解」的轉變。

NA: Mm-hmm.

紀子：嗯嗯嗯。

CA: Well, there's a challenge for the educators. Thank you so much.

克里斯：我想這是給教育家的一大挑戰，再次感謝您。

NA: Thank you very much. Thank you.

紀子：謝謝，非常謝謝你們。

(Applause)

（掌聲）

