Alex Gendler: The Turing test: Can a computer pass for a human?

What is consciousness? Can an artificial machine really think? Does the mind just consist of neurons in the brain, or is there some intangible spark at its core? For many, these have been vital considerations for the future of artificial intelligence. But British computer scientist Alan Turing decided to disregard all these questions in favor of a much simpler one: can a computer talk like a human?

This question led to an idea for measuring artificial intelligence that would famously come to be known as the Turing test. In his 1950 paper, "Computing Machinery and Intelligence," Turing proposed the following game. A human judge has a text conversation with unseen players and evaluates their responses. To pass the test, a computer must be able to replace one of the players without substantially changing the results. In other words, a computer would be considered intelligent if its conversation couldn't be easily distinguished from a human's.

Turing predicted that by the year 2000, machines with 100 megabytes of memory would be able to easily pass his test. But he may have jumped the gun. Even though today's computers have far more memory than that, few have succeeded, and those that have done well focused more on finding clever ways to fool judges than on using overwhelming computing power. Though it was never subjected to a real test, the first program with some claim to success was called ELIZA. With only a fairly short and simple script, it managed to mislead many people by mimicking a psychologist, encouraging them to talk more and reflecting their own questions back at them. Another early script, PARRY, took the opposite approach by imitating a paranoid schizophrenic who kept steering the conversation back to his own preprogrammed obsessions. Their success in fooling people highlighted one weakness of the test: humans regularly attribute intelligence to a whole range of things that are not actually intelligent. Nonetheless, annual competitions like the Loebner Prize have made the test more formal, with judges knowing ahead of time that some of their conversation partners are machines.
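
To make the "reflecting back" trick concrete, here is a minimal Python sketch in the spirit of ELIZA. It is not ELIZA's actual script: the rules, the pronoun table, and the names `REFLECTIONS`, `RULES`, `reflect`, and `respond` are all invented for illustration. It only shows how a few pattern rules and pronoun swaps can produce a reply that sounds attentive without any understanding.

```python
import re

# Pronoun swaps that turn the speaker's words back toward them.
# (Illustrative table, not taken from the original ELIZA script.)
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your", "am": "are",
    "you": "I", "your": "my", "mine": "yours", "yours": "mine",
}

# (pattern, reply template) pairs, tried in order. Hypothetical rules.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]


def reflect(fragment):
    """Swap first- and second-person words in a fragment, word by word."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(text):
    """Return a reply from the first matching rule, else a generic prompt."""
    cleaned = text.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    # No rule matched: encourage the speaker to keep talking.
    return "Please, go on."


print(respond("I feel ignored by my boss."))
print(respond("The weather is nice"))
```

The fallback line matters as much as the rules: when nothing matches, the program simply invites the person to say more, which is part of why such a short script could seem engaged.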

But while the quality has improved, many chatbot programmers have used strategies similar to those of ELIZA and PARRY. 1997's winner Catherine could carry on an amazingly focused and intelligent conversation, but mostly if the judge wanted to talk about Bill Clinton. And the more recent winner Eugene Goostman was given the persona of a 13-year-old Ukrainian boy, so judges interpreted its non sequiturs and awkward grammar as language and culture barriers. Meanwhile, other programs like Cleverbot have taken a different approach by statistically analyzing huge databases of real conversations to determine the best responses. Some also store memories of previous conversations in order to improve over time. But while Cleverbot's individual responses can sound incredibly human, its lack of a consistent personality and inability to deal with brand new topics are a dead giveaway.
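
The database-driven approach can be sketched in a few lines of Python. This is not Cleverbot's actual method, which the talk does not detail. It is a toy under stated assumptions: a made-up three-line conversation log (`CONVERSATIONS`), a simple bag-of-words cosine similarity, and hypothetical helper names (`bag_of_words`, `cosine`, `best_reply`). The idea is to find the logged utterance most similar to the new message and reuse the human reply that followed it.

```python
import math
import re
from collections import Counter

# Tiny invented log of (utterance, human reply) pairs; a real system
# would draw on a vastly larger collection of conversations.
CONVERSATIONS = [
    ("hello how are you", "I'm fine, thanks. How are you?"),
    ("what is your favorite food", "Probably pizza."),
    ("do you like music", "Yes, mostly jazz."),
]


def bag_of_words(text):
    """Count lowercase word occurrences in the text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_reply(message, log=CONVERSATIONS):
    """Return the reply paired with the most similar logged utterance,
    or None when no logged utterance shares a single word."""
    query = bag_of_words(message)
    scored = [(cosine(query, bag_of_words(prompt)), reply)
              for prompt, reply in log]
    score, reply = max(scored, key=lambda pair: pair[0])
    return reply if score > 0 else None


print(best_reply("Do you like jazz music?"))
print(best_reply("quantum chromodynamics"))
```

The second call returns `None` because nothing in the log resembles the message, a small-scale version of the trouble with brand new topics. And since each reply is borrowed from a different past speaker, nothing ties the answers together into one consistent personality.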

Who in Turing's day could have predicted that today's computers would be able to pilot spacecraft, perform delicate surgeries, and solve massive equations, but still struggle with the most basic small talk? Human language turns out to be an amazingly complex phenomenon that can't be captured by even the largest dictionary. Chatbots can be baffled by simple pauses, like "umm..." or questions with no correct answer. And a simple conversational sentence, like, "I took the juice out of the fridge and gave it to him, but forgot to check the date," requires a wealth of underlying knowledge and intuition to parse. It turns out that simulating a human conversation takes more than just increasing memory and processing power, and as we get closer to Turing's goal, we may have to deal with all those big questions about consciousness after all.
