Alex Gendler: The Turing test: Can a computer pass for a human?

What is consciousness? Can an artificial machine really think? Does the mind just consist of neurons in the brain, or is there some intangible spark at its core? For many, these have been vital considerations for the future of artificial intelligence. But British computer scientist Alan Turing decided to disregard all these questions in favor of a much simpler one: can a computer talk like a human?

This question led to an idea for measuring artificial intelligence that would famously come to be known as the Turing test. In his 1950 paper, "Computing Machinery and Intelligence," Turing proposed the following game. A human judge has a text conversation with unseen players and evaluates their responses. To pass the test, a computer must be able to replace one of the players without substantially changing the results. In other words, a computer would be considered intelligent if its conversation couldn't be easily distinguished from a human's.

Turing predicted that by the year 2000, machines with 100 megabytes of memory would be able to easily pass his test. But he may have jumped the gun. Even though today's computers have far more memory than that, few have succeeded, and those that have done well focused more on finding clever ways to fool judges than on using overwhelming computing power. Though it was never subjected to a real test, the first program with some claim to success was called ELIZA. With only a fairly short and simple script, it managed to mislead many people by mimicking a psychologist, encouraging them to talk more and reflecting their own questions back at them. Another early script, PARRY, took the opposite approach by imitating a paranoid schizophrenic who kept steering the conversation back to his own preprogrammed obsessions. Their success in fooling people highlighted one weakness of the test: humans regularly attribute intelligence to a whole range of things that are not actually intelligent. Nonetheless, annual competitions like the Loebner Prize have made the test more formal, with judges knowing ahead of time that some of their conversation partners are machines.
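
To make the ELIZA trick concrete, here is a minimal sketch of the "reflect it back" idea in Python. It is not ELIZA's actual script: the rules, the pronoun table, and the names `RULES`, `reflect`, and `respond` are all made up for illustration, and the only assumption carried over from the talk is that a short list of patterns can turn a person's own words into a follow-up question.

```python
import re

# Hypothetical pronoun table: swaps first and second person so the
# speaker's own words can be handed back to them.
PRONOUN_SWAPS = {
    "i": "you", "me": "you", "my": "your", "am": "are",
    "you": "I", "your": "my",
}

# Hypothetical rules: (pattern, reply template). The first match wins.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
]

# Used when no rule matches; it simply encourages the person to keep talking.
FALLBACK = "Please go on."


def reflect(fragment):
    """Swap pronouns word by word, e.g. 'my boss' becomes 'your boss'."""
    words = fragment.lower().split()
    return " ".join(PRONOUN_SWAPS.get(word, word) for word in words)


def respond(message):
    """Return a canned question built from the speaker's own words."""
    cleaned = message.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    return FALLBACK


print(respond("I feel ignored by my boss."))
print(respond("The weather is nice."))
```

Nothing in this sketch models what the person means. It only rearranges their words, which is why a program like this can seem attentive without understanding anything.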

But while the quality has improved, many chatbot programmers have used strategies similar to those of ELIZA and PARRY. 1997's winner, Catherine, could carry on an amazingly focused and intelligent conversation, but mostly if the judge wanted to talk about Bill Clinton. And the more recent winner, Eugene Goostman, was given the persona of a 13-year-old Ukrainian boy, so judges interpreted its non sequiturs and awkward grammar as language and culture barriers. Meanwhile, other programs like Cleverbot have taken a different approach by statistically analyzing huge databases of real conversations to determine the best responses. Some also store memories of previous conversations in order to improve over time. But while Cleverbot's individual responses can sound incredibly human, its lack of a consistent personality and inability to deal with brand new topics are a dead giveaway.
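
The database approach can be sketched in a few lines of Python. The talk does not say how Cleverbot scores candidate responses, so this example swaps in a simple stand-in: word-overlap (Jaccard) similarity against a tiny, invented conversation log. The log entries, the `threshold` value, and the names `tokenize`, `jaccard`, and `best_reply` are all assumptions for illustration.

```python
import re

# Hypothetical log of past exchanges: (what one person said, how the other replied).
CONVERSATION_LOG = [
    ("how are you today", "Pretty good, thanks. How about you?"),
    ("what is your favorite food", "I could eat pizza every day."),
    ("do you like music", "Yes, mostly jazz."),
]


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Shared words divided by total distinct words; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_reply(message, log=CONVERSATION_LOG, threshold=0.2):
    """Reuse the reply whose logged prompt looks most like the message.

    Returns None when nothing in the log is similar enough.
    """
    words = tokenize(message)
    prompt, reply = max(log, key=lambda pair: jaccard(words, tokenize(pair[0])))
    if jaccard(words, tokenize(prompt)) < threshold:
        return None
    return reply


print(best_reply("How are you?"))
print(best_reply("Explain quantum chromodynamics"))
```

The second call returns `None` because no logged prompt shares a word with it, a small-scale version of the trouble with brand new topics. And since each reply is borrowed from a different past speaker, nothing ties the answers together into one personality.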

Who in Turing's day could have predicted that today's computers would be able to pilot spacecraft, perform delicate surgeries, and solve massive equations, but still struggle with the most basic small talk? Human language turns out to be an amazingly complex phenomenon that can't be captured by even the largest dictionary. Chatbots can be baffled by simple pauses, like "umm..." or questions with no correct answer. And a simple conversational sentence, like, "I took the juice out of the fridge and gave it to him, but forgot to check the date," requires a wealth of underlying knowledge and intuition to parse. It turns out that simulating a human conversation takes more than just increasing memory and processing power, and as we get closer to Turing's goal, we may have to deal with all those big questions about consciousness after all.
