Alex Gendler: The Turing test: Can a computer pass for a human?

What is consciousness? Can an artificial machine really think? Does the mind just consist of neurons in the brain, or is there some intangible spark at its core? For many, these have been vital considerations for the future of artificial intelligence. But British computer scientist Alan Turing decided to disregard all these questions in favor of a much simpler one: can a computer talk like a human?

This question led to an idea for measuring artificial intelligence that would famously come to be known as the Turing test. In his 1950 paper, "Computing Machinery and Intelligence," Turing proposed the following game. A human judge has a text conversation with unseen players and evaluates their responses. To pass the test, a computer must be able to replace one of the players without substantially changing the results. In other words, a computer would be considered intelligent if its conversation couldn't be easily distinguished from a human's.

Turing predicted that by the year 2000, machines with 100 megabytes of memory would be able to easily pass his test. But he may have jumped the gun. Even though today's computers have far more memory than that, few have succeeded, and those that have done well focused more on finding clever ways to fool judges than on using overwhelming computing power. Though it was never subjected to a real test, the first program with some claim to success was called ELIZA. With only a fairly short and simple script, it managed to mislead many people by mimicking a psychologist, encouraging them to talk more and reflecting their own questions back at them. Another early script, PARRY, took the opposite approach by imitating a paranoid schizophrenic who kept steering the conversation back to his own preprogrammed obsessions. Their success in fooling people highlighted one weakness of the test: humans regularly attribute intelligence to a whole range of things that are not actually intelligent. Nonetheless, annual competitions like the Loebner Prize have made the test more formal, with judges knowing ahead of time that some of their conversation partners are machines.
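
To make the "short and simple script" idea concrete, here is a minimal sketch in Python of how an ELIZA-style program might reflect a speaker's words back at them. It is not the original ELIZA script: the rule list, the pronoun table, and the names `REFLECTIONS`, `RULES`, `reflect`, and `respond` are all assumptions made up for illustration.

```python
import re

# Hypothetical pronoun swaps used to "reflect" a statement back at the speaker.
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your", "am": "are",
    "you": "I", "your": "my",
}

# Hypothetical (pattern, template) rules; a real script would have many more.
RULES = [
    (re.compile(r"i feel (.*)", re.I), "Why do you feel {0}?"),
    (re.compile(r"i am (.*)", re.I), "How long have you been {0}?"),
    (re.compile(r"(.*)\?$"), "Why do you ask that?"),
]

def reflect(fragment):
    """Swap first- and second-person words so the phrase points back at the user."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)

def respond(message):
    """Return the first matching canned reply, or a generic prompt to keep talking."""
    text = message.strip().rstrip(".!")
    for pattern, template in RULES:
        match = pattern.match(text)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please tell me more."

print(respond("I feel sad about my job."))  # Why do you feel sad about your job?
print(respond("Hello"))                     # Please tell me more.
```

Nothing in this sketch models meaning. It only matches surface patterns and echoes the user's own words, which is why a short script of this kind can seem attentive without understanding anything.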

But while the quality has improved, many chatbot programmers have used similar strategies to ELIZA and PARRY. 1997's winner Catherine could carry on an amazingly focused and intelligent conversation, but mostly if the judge wanted to talk about Bill Clinton. And the more recent winner Eugene Goostman was given the persona of a 13-year-old Ukrainian boy, so judges interpreted its non sequiturs and awkward grammar as language and culture barriers. Meanwhile, other programs like Cleverbot have taken a different approach by statistically analyzing huge databases of real conversations to determine the best responses. Some also store memories of previous conversations in order to improve over time. But while Cleverbot's individual responses can sound incredibly human, its lack of a consistent personality and inability to deal with brand new topics are a dead giveaway.
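
The database-driven approach can be sketched in a few lines of Python. This is not Cleverbot's actual algorithm, which the talk does not describe in detail. It is a toy version that assumes a tiny hand-written log and a simple word-overlap score; `CONVERSATION_LOG`, `tokens`, `similarity`, and `best_reply` are hypothetical names.

```python
# Hypothetical log of (prompt, reply) pairs taken from past conversations.
CONVERSATION_LOG = [
    ("how are you today", "I'm doing well, thanks. How about you?"),
    ("what is your favorite food", "Probably pizza."),
    ("do you like music", "Yes, I listen to it all the time."),
]

def tokens(text):
    """Lowercase the text, drop basic punctuation, and return its set of words."""
    return set(text.lower().replace("?", "").replace(".", "").split())

def similarity(a, b):
    """Jaccard overlap between the word sets of two messages (0.0 to 1.0)."""
    words_a, words_b = tokens(a), tokens(b)
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

def best_reply(message, log=CONVERSATION_LOG):
    """Return the logged reply whose prompt looks most like the new message."""
    _prompt, reply = max(log, key=lambda pair: similarity(message, pair[0]))
    return reply

print(best_reply("Do you like jazz music?"))  # Yes, I listen to it all the time.
```

A message on a topic the log has never seen, such as "Tell me about quantum computing", scores zero against every stored prompt, so this sketch simply falls back to the first entry in the log. That mirrors, in miniature, the trouble with brand new topics mentioned above.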

Who in Turing's day could have predicted that today's computers would be able to pilot spacecraft, perform delicate surgeries, and solve massive equations, but still struggle with the most basic small talk? Human language turns out to be an amazingly complex phenomenon that can't be captured by even the largest dictionary. Chatbots can be baffled by simple pauses, like "umm..." or questions with no correct answer. And a simple conversational sentence, like, "I took the juice out of the fridge and gave it to him, but forgot to check the date," requires a wealth of underlying knowledge and intuition to parse. It turns out that simulating a human conversation takes more than just increasing memory and processing power, and as we get closer to Turing's goal, we may have to deal with all those big questions about consciousness after all.
