Alex Gendler: The Turing test: Can a computer pass for a human?

What is consciousness? Can an artificial machine really think? Does the mind just consist of neurons in the brain, or is there some intangible spark at its core? For many, these have been vital considerations for the future of artificial intelligence. But British computer scientist Alan Turing decided to disregard all these questions in favor of a much simpler one: can a computer talk like a human?

This question led to an idea for measuring artificial intelligence that would famously come to be known as the Turing test. In his 1950 paper, "Computing Machinery and Intelligence," Turing proposed the following game. A human judge has a text conversation with unseen players and evaluates their responses. To pass the test, a computer must be able to replace one of the players without substantially changing the results. In other words, a computer would be considered intelligent if its conversation couldn't be easily distinguished from a human's.

Turing predicted that by the year 2000, machines with 100 megabytes of memory would be able to easily pass his test. But he may have jumped the gun. Even though today's computers have far more memory than that, few have succeeded, and those that have done well focused more on finding clever ways to fool judges than on using overwhelming computing power. Though it was never subjected to a real test, the first program with some claim to success was called ELIZA. With only a fairly short and simple script, it managed to mislead many people by mimicking a psychologist, encouraging them to talk more and reflecting their own questions back at them. Another early script, PARRY, took the opposite approach by imitating a paranoid schizophrenic who kept steering the conversation back to his own preprogrammed obsessions. Their success in fooling people highlighted one weakness of the test: humans regularly attribute intelligence to a whole range of things that are not actually intelligent. Nonetheless, annual competitions like the Loebner Prize have made the test more formal, with judges knowing ahead of time that some of their conversation partners are machines.
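
To make the ELIZA trick concrete, here is a minimal sketch of the "reflect it back" idea in Python. It is not the original ELIZA script: the rules, the pronoun table, and the function names (`reflect`, `respond`) are all made up for illustration, and a real script would need many more patterns.

```python
import re

# Hypothetical pronoun swaps used to turn the speaker's words around.
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your", "am": "are",
    "you": "I", "your": "my",
}

# Illustrative (pattern, response template) pairs, tried in order.
RULES = [
    (re.compile(r"i feel (.*)", re.I), "Why do you feel {0}?"),
    (re.compile(r"i am (.*)", re.I), "How long have you been {0}?"),
    (re.compile(r"(.*)\?$"), "Why do you ask that?"),
]

def reflect(fragment: str) -> str:
    """Swap first- and second-person words so the phrase points back at the speaker."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)

def respond(message: str) -> str:
    """Return a canned, psychologist-style reply built from the user's own words."""
    text = message.strip().rstrip(".!")
    for pattern, template in RULES:
        match = pattern.match(text)
        if match:
            return template.format(reflect(match.group(1)))
    # No rule matched: fall back to a generic prompt to keep the person talking.
    return "Please tell me more."

print(respond("I feel ignored by my boss."))
```

Nothing here understands the sentence. The program only matches surface patterns and hands the speaker's own words back, which is why it can seem attentive without being intelligent.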

But while the quality has improved, many chatbot programmers have used similar strategies to ELIZA and PARRY. The 1997 winner, Catherine, could carry on an amazingly focused and intelligent conversation, but mostly if the judge wanted to talk about Bill Clinton. And the more recent winner, Eugene Goostman, was given the persona of a 13-year-old Ukrainian boy, so judges interpreted its non sequiturs and awkward grammar as language and culture barriers. Meanwhile, other programs like Cleverbot have taken a different approach by statistically analyzing huge databases of real conversations to determine the best responses. Some also store memories of previous conversations in order to improve over time. But while Cleverbot's individual responses can sound incredibly human, its lack of a consistent personality and inability to deal with brand new topics are a dead giveaway.
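
The database approach can be sketched in a few lines of Python as well. This is an assumption-laden toy, not Cleverbot's actual method, which the talk does not describe: it uses a three-entry made-up log (`CONVERSATIONS`) in place of a huge database and plain word overlap (Jaccard similarity) in place of real statistics.

```python
import re

# Tiny hypothetical log of (prompt, reply) pairs standing in for a huge database.
CONVERSATIONS = [
    ("how are you today", "Pretty good, thanks. How about you?"),
    ("what is your favorite food", "Probably pizza."),
    ("do you like music", "Yes, mostly jazz."),
]

def tokens(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))

def jaccard(a, b):
    """Word overlap between two sets: shared words divided by all words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def best_reply(message, log=CONVERSATIONS):
    """Reuse the reply whose recorded prompt looks most like the new message."""
    query = tokens(message)
    prompt, reply = max(log, key=lambda pair: jaccard(query, tokens(pair[0])))
    return reply

print(best_reply("How are you?"))
```

A message on a topic that never appears in the log still gets an answer, because `max` simply returns the first entry when every score is zero. That is a small-scale version of the giveaway described above: each reply sounds human on its own, but nothing ties it to the new topic.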

Who in Turing's day could have predicted that today's computers would be able to pilot spacecraft, perform delicate surgeries, and solve massive equations, but still struggle with the most basic small talk? Human language turns out to be an amazingly complex phenomenon that can't be captured by even the largest dictionary. Chatbots can be baffled by simple pauses, like "umm..." or questions with no correct answer. And a simple conversational sentence, like, "I took the juice out of the fridge and gave it to him, but forgot to check the date," requires a wealth of underlying knowledge and intuition to parse. It turns out that simulating a human conversation takes more than just increasing memory and processing power, and as we get closer to Turing's goal, we may have to deal with all those big questions about consciousness after all.
