Cathy O'Neil: The era of blind faith in big data must end

Algorithms are everywhere. They sort and separate the winners from the losers. The winners get the job or a good credit card offer. The losers don't even get an interview or they pay more for insurance. We're being scored with secret formulas that we don't understand that often don't have systems of appeal. That begs the question: What if the algorithms are wrong?

Алгоритми є повсюди. Вони сортують і відрізняють переможців від переможених. Переможці отримують роботу чи вигідні умови оформлення кредитки. Переможені не доходять навіть до співбесіди, або платять більше за страхування. Нас обчислюють секретними формулами, які ми не розуміємо, і до яких часто не можна подати апеляційні скарги. Тому виникає питання: а якщо припустити, що алгоритми неправильні?

To build an algorithm you need two things: you need data, what happened in the past, and a definition of success, the thing you're looking for and often hoping for. You train an algorithm by looking, figuring out. The algorithm figures out what is associated with success. What situation leads to success?

Для побудови алгоритму потрібні дві речі: потрібні дані про те, що сталося у минулому, і визначення успіху, те, чого ви прагнете і на що часто сподіваєтеся. Ви навчаєте алгоритм, розмірковуючи, з'ясовуючи. Алгоритм з'ясовує, що асоціюється із успіхом. Яка ситуація призводить до успіху?

Actually, everyone uses algorithms. They just don't formalize them in written code. Let me give you an example. I use an algorithm every day to make a meal for my family. The data I use is the ingredients in my kitchen, the time I have, the ambition I have, and I curate that data. I don't count those little packages of ramen noodles as food.

Усі люди вживають алгоритми. Вони просто не записують їх у вигляді коду. Я наведу вам приклад. Я щодня вживаю алгоритм, щоб приготувати їсти для сім'ї. Дані, що я використовую, це інгредієнти в мене на кухні, скільки часу я маю, наскільки я захоплена, і я - куратор цих даних. Я не зараховую маленькі пакетики локшини рамен до їжі.

(Laughter)

(Сміх)

My definition of success is: a meal is successful if my kids eat vegetables. It's very different from if my youngest son were in charge. He'd say success is if he gets to eat lots of Nutella. But I get to choose success. I am in charge. My opinion matters. That's the first rule of algorithms.

Ось моє визначення успіху: страва успішна, якщо мої діти їдять овочі. Дайте вирішувати моєму молодшому синові, і все буде інакше. Для нього успіх - це якщо вдається з'їсти багато Нутелли. Але я визначаю, що таке успіх. Я вирішую. Моя точка зору має значення. Ось таким є перше правило алгоритмів.
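
To make the point about choosing success concrete, here is a minimal sketch in Python. The meal records, the field names and both scoring functions are invented for illustration; nothing here comes from a real system. The same data gives a different "best" answer depending on whose definition of success is plugged in.

```python
# Hypothetical meal records; every name and number is made up for illustration.
MEALS = [
    {"name": "stir-fry", "vegetables_eaten": 3, "nutella_spoons": 0},
    {"name": "pasta", "vegetables_eaten": 1, "nutella_spoons": 0},
    {"name": "crepes", "vegetables_eaten": 0, "nutella_spoons": 4},
]


def best_meal(meals, success):
    """Return the name of the meal that scores highest under `success`."""
    return max(meals, key=success)["name"]


def parent_success(meal):
    # The parent's opinion: a meal is successful if the kids eat vegetables.
    return meal["vegetables_eaten"]


def son_success(meal):
    # The youngest son's opinion: success is lots of Nutella.
    return meal["nutella_spoons"]


print(best_meal(MEALS, parent_success))  # stir-fry
print(best_meal(MEALS, son_success))     # crepes
```

The data never changed between the two calls; only the opinion about what counts as success did.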

Algorithms are opinions embedded in code. It's really different from what I think most people think of algorithms. They think algorithms are objective and true and scientific. That's a marketing trick. It's also a marketing trick to intimidate you with algorithms, to make you trust and fear algorithms because you trust and fear mathematics. A lot can go wrong when we put blind faith in big data.

Алгоритми - це точки зору, вбудовані у код. Це дуже відрізняється від того, що більшість людей думає про алгоритми. Вони думають, що алгоритми об'єктивні, правдиві і науково обґрунтовані. Це маркетингові хитрощі. Це також будуть маркетингові хитрощі, якщо вам будуть погрожувати алгоритмами, будуть примушувати вас довіряти алгоритмам та боятися їх, бо ви довіряєте математиці та боїтеся її. Чимало речей може піти не так, як треба, коли ми сліпо довіряємо великим даним.

This is Kiri Soares. She's a high school principal in Brooklyn. In 2011, she told me her teachers were being scored with a complex, secret algorithm called the "value-added model." I told her, "Well, figure out what the formula is, show it to me. I'm going to explain it to you." She said, "Well, I tried to get the formula, but my Department of Education contact told me it was math and I wouldn't understand it."

Це Кірі Соарс. Вона - директор школи старших класів у Брукліні. У 2011 р. вона розповіла мені, що її вчителів оцінювали за складним секретним алгоритмом під назвою "модель з розширеними функціями". Я сказала їй: "З'ясуй, що це за формула, покажи її мені. Я тобі її поясню". Вона сказала: "Я намагалася отримати формулу, але моя знайома у міносвіти сказала мені, що то математика, і що мені цього не зрозуміти".

It gets worse. The New York Post filed a Freedom of Information Act request, got all the teachers' names and all their scores and they published them as an act of teacher-shaming. When I tried to get the formulas, the source code, through the same means, I was told I couldn't. I was denied. I later found out that nobody in New York City had access to that formula. No one understood it. Then someone really smart got involved, Gary Rubinstein. He found 665 teachers from that New York Post data that actually had two scores. That could happen if they were teaching seventh grade math and eighth grade math. He decided to plot them. Each dot represents a teacher.

Далі буде гірше. "Нью-Йорк Пост" надіслала запит згідно із Законом про свободу інформації, отримала імена усіх вчителів та усі їх оцінки, і потім вони опублікували це задля присоромлення вчителів. Коли я намагалася тими ж методами одержати формули, початковий код, мені сказали, що я не можу цього зробити. Мені відмовили. Пізніше я дізналася, що ніхто у місті Нью-Йорк не мав доступу до цієї формули. Ніхто її не розумів. Потім до цього долучилася одна мудра людина, Гері Рубінштейн. Він знайшов 665 вчителів з тої статті у "Нью-Йорк Пост", вчителів, що, власне, мали дві оцінки. Так могло статися, якщо вони викладали математику у сьомому класі і математику у восьмому. Він вирішив відобразити їх дані. Кожна крапка репрезентує вчителя.

(Laughter)

(Сміх)

What is that?

Що це таке?

(Laughter)

(Сміх)

That should never have been used for individual assessment. It's almost a random number generator.

Це ніколи не слід було використовувати для індивідуальної оцінки. Це майже як генератор випадкових чисел.

(Applause)

(Оплески)
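
The consistency check described above, comparing two scores given to the same teacher, can be sketched as a correlation. The scores below are made up and are not the New York Post data; `pearson` is a hand-written helper, assumed here only for illustration. A score that measures something stable about a teacher should give a correlation near 1, while something close to a random number generator would give a value near 0.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up scores for five teachers, one list per grade they taught.
seventh_grade = [90, 15, 60, 35, 80]
eighth_grade = [20, 70, 85, 40, 30]

print(round(pearson(seventh_grade, eighth_grade), 2))
```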

But it was. This is Sarah Wysocki. She got fired, along with 205 other teachers, from the Washington, DC school district, even though she had great recommendations from her principal and the parents of her kids.

Але це було використано. Це - Сара Висоцкі. Її звільнили, разом із 205 іншими вчителями, зі шкільного району м.Вашингтон в окрузі Колумбія, хоча вона мала прекрасні рекомендації від її директора та батьків її дітей.

I know what a lot of you guys are thinking, especially the data scientists, the AI experts here. You're thinking, "Well, I would never make an algorithm that inconsistent." But algorithms can go wrong, even have deeply destructive effects with good intentions. And whereas an airplane that's designed badly crashes to the earth and everyone sees it, an algorithm designed badly can go on for a long time, silently wreaking havoc.

Я знаю, про що зараз думає багато із вас, зокрема фахівці з обробки даних і штучного інтелекту. Ви думаєте: "Ну, я б ніколи не створив алгоритм з такими протиріччями". Але алгоритми можуть піти не за планом, навіть мати надзвичайно нищівні наслідки, незважаючи на добрі наміри. В той час, як літак, що був погано спроектований, врізається у землю, і всі це бачать, алгоритм, що був погано розроблений, може довго функціонувати і тихенько завдавати шкоди.

This is Roger Ailes.

Це - Роджер Ейлс.

(Laughter)

(Сміх)

He founded Fox News in 1996. More than 20 women complained about sexual harassment. They said they weren't allowed to succeed at Fox News. He was ousted last year, but we've seen recently that the problems have persisted. That begs the question: What should Fox News do to turn over another leaf?

Він заснував Fox News у 1996 р. Понад 20 жінок поскаржилися на сексуальні домагання. Вони казали, що їм не дозволяли досягати успіхів у Fox News. Минулого року його вигнали, але ми нещодавно побачили, що проблеми все одно існують. Виникає питання: що повинна зробити Fox News, щоб почати нову сторінку?

Well, what if they replaced their hiring process with a machine-learning algorithm? That sounds good, right? Think about it. The data, what would the data be? A reasonable choice would be the last 21 years of applications to Fox News. Reasonable. What about the definition of success? Reasonable choice would be, well, who is successful at Fox News? I guess someone who, say, stayed there for four years and was promoted at least once. Sounds reasonable. And then the algorithm would be trained. It would be trained to look for people to learn what led to success, what kind of applications historically led to success by that definition. Now think about what would happen if we applied that to a current pool of applicants. It would filter out women because they do not look like people who were successful in the past.

А якщо б вони замість свого процесу найму працівників вживали алгоритм машинного навчання? Непогана ідея, правда? Подумайте про це. Дані, які в нас були б дані? Резонно розглянути відгуки на вакансії у Fox News за останній 21 рік. Резонно. А як ми визначимо успіх? Резонно було б обрати, ну, хто є успішним у Fox News? Скажімо, та людина, що пробула там чотири роки, і яка хоч раз отримала підвищення. Резонне визначення. А потім ми б навчали алгоритм. Його б навчали шукати людей, вивчати, що призвело до успіху, якого роду відгуки про вакансії призводили до успіху за цим визначенням. Подумайте, що сталося би по відношенню до теперішнього банку даних про кандидатів. Алгоритм відфільтрував би жінок, бо вони не виглядають, як люди, що були успішними у минулому.
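
The hiring thought experiment above can be sketched in a few lines. Everything in this sketch is an assumption: the history is invented, "success" is the talk's definition (stayed four years and promoted at least once), and the "model" is just a per-group success rate. A real system would more likely pick up the pattern through proxy features than through an explicit gender field, but the effect is the same.

```python
from collections import defaultdict

# Hypothetical history of (group, succeeded) pairs. "succeeded" means the
# person stayed four years and was promoted at least once.
HISTORY = (
    [("m", True)] * 6 + [("m", False)] * 4
    + [("f", True)] * 1 + [("f", False)] * 9
)


def train(history):
    """Learn the historical success rate for each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [successes, total]
    for group, succeeded in history:
        counts[group][0] += int(succeeded)
        counts[group][1] += 1
    return {group: s / n for group, (s, n) in counts.items()}


def shortlist(applicants, model, threshold=0.5):
    """Keep applicants whose group looked successful in the past."""
    return [name for name, group in applicants
            if model.get(group, 0.0) >= threshold]


model = train(HISTORY)
applicants = [("Ana", "f"), ("Ben", "m"), ("Cleo", "f")]
print(shortlist(applicants, model))  # ['Ben']
```

The model never asks why women rarely met the definition of success in the past; it only repeats the pattern.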

Algorithms don't make things fair if you just blithely, blindly apply algorithms. They don't make things fair. They repeat our past practices, our patterns. They automate the status quo. That would be great if we had a perfect world, but we don't. And I'll add that most companies don't have embarrassing lawsuits, but the data scientists in those companies are told to follow the data, to focus on accuracy. Think about what that means. Because we all have bias, it means they could be codifying sexism or any other kind of bigotry.

Алгоритми не забезпечують справедливість, якщо ви застосовуєте алгоритми безтурботно і всліпу. Це не гарантія справедливості. Вони повторюють наші минулі методики роботи, наші шаблони. Вони автоматизують статус-кво. Як було б добре, якщо б ми жили в ідеальному світі, але ми в ньому не живемо. Додам, що більшість компаній не має прикрих правових спорів, але науковцям з даних у тих компаніях кажуть слідкувати за даними, концентруватися на точності. Подумайте, що це означає. Оскільки усі ми маємо упередження, вони можуть кодувати сексизм чи інший вид нетерпимості.

Thought experiment, because I like them: an entirely segregated society -- racially segregated, all towns, all neighborhoods and where we send the police only to the minority neighborhoods to look for crime. The arrest data would be very biased. What if, on top of that, we found the data scientists and paid the data scientists to predict where the next crime would occur? Minority neighborhood. Or to predict who the next criminal would be? A minority. The data scientists would brag about how great and how accurate their model would be, and they'd be right.

Інтелектуальний експеримент, бо вони мені подобаються: повністю сегреговане суспільство - расова сегрегація в усіх містах, усіх кварталах, і поліцію посилають лиш до кварталів, де проживає меншість, щоб шукати там злочинців. Дані про арешти були б дуже упередженими. А якщо, окрім того, ми знайшли б науковців з даних і платили б науковцям за передбачення, де буде скоєно наступний злочин? У кварталі, де проживає меншість. Чи передбачити, хто буде наступним злочинцем? Людина з меншості. Науковці хвалилися б про те, наскільки чудовою і точною є їх модель, і вони були б праві.
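
A minimal sketch of this thought experiment, with assumed numbers: two neighborhoods have the same underlying crime count, but only one is patrolled, so only its crimes show up as arrests. The names and figures are hypothetical.

```python
# Assumed for illustration: identical true crime counts in both neighborhoods.
TRUE_CRIMES = {"minority": 10, "majority": 10}
PATROLLED = {"minority"}  # police are sent only here


def recorded_arrests(true_crimes, patrolled):
    """Crimes become data only where police are present to record them."""
    return {hood: (n if hood in patrolled else 0)
            for hood, n in true_crimes.items()}


def predict_hotspot(arrests):
    """Predict the next crime where most past arrests happened."""
    return max(arrests, key=arrests.get)


def accuracy_against_arrests(prediction, arrests):
    """Share of recorded arrests that fall in the predicted neighborhood."""
    total = sum(arrests.values())
    return arrests[prediction] / total if total else 0.0


arrests = recorded_arrests(TRUE_CRIMES, PATROLLED)
hotspot = predict_hotspot(arrests)
print(hotspot, accuracy_against_arrests(hotspot, arrests))  # minority 1.0
```

Measured against the arrest data, the prediction is perfectly accurate, even though the underlying crime counts were assumed equal.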

Now, reality isn't that drastic, but we do have severe segregations in many cities and towns, and we have plenty of evidence of biased policing and justice system data. And we actually do predict hotspots, places where crimes will occur. And we do predict, in fact, the individual criminality, the criminality of individuals. The news organization ProPublica recently looked into one of those "recidivism risk" algorithms, as they're called, being used in Florida during sentencing by judges. Bernard, on the left, the black man, was scored a 10 out of 10. Dylan, on the right, 3 out of 10. 10 out of 10, high risk. 3 out of 10, low risk. They were both brought in for drug possession. They both had records, but Dylan had a felony and Bernard didn't. This matters, because the higher your score, the more likely you are to be given a longer sentence.

В реальному житті немає таких крайнощів, але ми маємо суттєву сегрегацію у великих і малих містах, і маємо досить доказів щодо упередженості поліції і судової системи. І ми справді передбачаємо гарячі точки, місця, де буде скоєно злочини. І це факт, що ми передбачаємо індивідуальні злочинні дії, злочинність окремих людей. Інформагентство ProPublica нещодавно провело розслідування щодо одного з алгоритмів "ризику рецидивізму", так вони називаються, що використовують судді у Флориді, коли виносять вирок. Бернард, зліва, темношкірий, отримав рейтинг 10 з 10. Ділан, справа, 3 з 10. 10 з 10, високий ризик. 3 з 10, низький ризик. Їх обох заарештували за зберігання наркотиків. В них кримінальне минуле, але Ділан скоїв тяжкий злочин, а Бернард - ні. Це має значення, бо чим вищий в тебе ризик, тим ймовірніше, що ти отримаєш довший термін покарання.

What's going on? Data laundering. It's a process by which technologists hide ugly truths inside black box algorithms and call them objective; call them meritocratic. When they're secret, important and destructive, I've coined a term for these algorithms: "weapons of math destruction."

Що ж відбувається? Відмивання даних. Це процес, коли технологи ховають неприємну правду всередині алгоритмів типу "чорний ящик" і називають їх об'єктивними; називають їх меритократичними. Коли ці алгоритми секретні, важливі та нищівні, я створила для них термін: "зброя математичного знищення".

(Laughter)

(Сміх)

(Applause)

(Оплески)

They're everywhere, and it's not a mistake. These are private companies building private algorithms for private ends. Even the ones I talked about for teachers and the public police, those were built by private companies and sold to the government institutions. They call it their "secret sauce" -- that's why they can't tell us about it. It's also private power. They are profiting from wielding the authority of the inscrutable. Now you might think, since all this stuff is private and there's competition, maybe the free market will solve this problem. It won't. There's a lot of money to be made in unfairness.

Вони повсюди, і це не помилково. Це приватні компанії, що будують приватні алгоритми для приватного зиску. Навіть приклади, що я навела, для вчителів і державної поліції, приватні компанії побудували їх і продали державним установам. Вони кажуть, що це їх "секретний соус", тому вони не можуть розповісти нам про нього. Це також вплив приватних інтересів. Вони отримують зиск, маючи владу над незбагненним. Позаяк це все приватні компанії, ви можете припустити, що існує конкуренція, можливо, вільний ринок вирішить цю проблему. Ні, не вирішить. На несправедливості можна заробити чимало грошей.

Also, we're not economic rational agents. We all are biased. We're all racist and bigoted in ways that we wish we weren't, in ways that we don't even know. We know this, though, in aggregate, because sociologists have consistently demonstrated this with these experiments they build, where they send a bunch of applications to jobs out, equally qualified but some have white-sounding names and some have black-sounding names, and it's always disappointing, the results -- always.

До того ж, ми не є економічними раціональними агентами. У нас у всіх є упередження. Ми всі до певної міри нетерпимі расисти, хоч нам це і не подобається, ми самі не знаємо, до якої міри. Однак ми знаємо, що так загалом і є, бо соціологи систематично демонструють це у експериментах, що вони проводять, коли вони надсилають низку відгуків на вакансії, однакові кваліфікації, але у деяких "білі" імена, а в інших імена, як у темношкірих, і результати завжди невтішні, завжди.

So we are the ones that are biased, and we are injecting those biases into the algorithms by choosing what data to collect, like I chose not to think about ramen noodles -- I decided it was irrelevant. But by trusting the data that's actually picking up on past practices and by choosing the definition of success, how can we expect the algorithms to emerge unscathed? We can't. We have to check them. We have to check them for fairness.

Отже, ми маємо упередження, і ми вбудовуємо ці упередження в алгоритми, обираючи, які дані потрібно збирати, так само, як я вирішила не думати про локшину рамен - я вирішила, що це малозначуще. Але коли ми довіряємо даним, що вловлюють практику, що склалася, і обираємо визначення успіху, як ми можемо очікувати, що алгоритми будуть без несправностей? Не можемо. Ми повинні перевіряти їх. Перевіряти їх на справедливість.

The good news is, we can check them for fairness. Algorithms can be interrogated, and they will tell us the truth every time. And we can fix them. We can make them better. I call this an algorithmic audit, and I'll walk you through it.

На щастя, ми можемо перевіряти їх на справедливість. Алгоритми можна розпитувати, і вони щоразу казатимуть нам правду. І ми можемо виправити їх. Ми можемо покращити їх. Я називаю це "алгоритмічним аудитом", і я вам зараз його поясню.

First, data integrity check. For the recidivism risk algorithm I talked about, a data integrity check would mean we'd have to come to terms with the fact that in the US, whites and blacks smoke pot at the same rate but blacks are far more likely to be arrested -- four or five times more likely, depending on the area. What is that bias looking like in other crime categories, and how do we account for it?

По-перше, перевірка цілісності даних. Повертаючись до алгоритму ризику рецидивізму, перевірка цілісності даних означала б, що нам довелося б змиритися із фактом, що у США білі і темношкірі курять марихуану однаково часто, однак темношкірих заарештовують набагато частіше - у чотири-п'ять разів частіше, залежно від району. Як ця упередженість виглядає в інших кримінальних категоріях, і як ми її враховуємо?

Second, we should think about the definition of success, audit that. Remember -- with the hiring algorithm? We talked about it. Someone who stays for four years and is promoted once? Well, that is a successful employee, but it's also an employee that is supported by their culture. That said, also it can be quite biased. We need to separate those two things. We should look to the blind orchestra audition as an example. That's where the people auditioning are behind a sheet. What I want to think about there is the people who are listening have decided what's important and they've decided what's not important, and they're not getting distracted by that. When the blind orchestra auditions started, the number of women in orchestras went up by a factor of five.

По-друге, нам слід подумати про визначення успіху, проводити аудит визначення. Пригадуєте алгоритм щодо прийняття на роботу? Ми про нього говорили. Той, хто утримується на роботі чотири роки і раз отримує підвищення? Ну так, це успішний працівник, але це також працівник, котрого підтримує організаційна культура. Однак і тут може бути багато упередження. Нам треба розрізняти ці дві речі. Давайте брати приклад з прослуховування всліпу на роль в оркестрі. Це коли люди на прослуховуванні є за ширмою. На чому я хочу тут зосередитись: люди, котрі прослуховують кандидатів, вирішили, що важливе, і вирішили, що неважливе, і їх це не відволікає. Коли розпочалися прослуховування всліпу, кількість жінок в оркестрах зросла у п'ять разів.

Next, we have to consider accuracy. This is where the value-added model for teachers would fail immediately. No algorithm is perfect, of course, so we have to consider the errors of every algorithm. How often are there errors, and for whom does this model fail? What is the cost of that failure?

Потім нам потрібно розглянути точність. Ось тут модель з розширеними функціями для вчителів одразу б провалилася. Звісно, що не існує ідеальних алгоритмів, тому нам треба приймати до уваги помилки у кожному алгоритмі. Як часто там трапляються помилки, і кого підведе ця модель? Якою є ціна цього провалу?
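
The question "for whom does this model fail?" can be sketched as an error rate computed per group. The records, the group labels and the field names below are hypothetical, and the false positive rate is only one of several error measures an audit could compare.

```python
def false_positive_rate(records):
    """Share of people who did not reoffend but were flagged high risk."""
    negatives = [r for r in records if not r["reoffended"]]
    if not negatives:
        return 0.0
    return sum(r["flagged"] for r in negatives) / len(negatives)


def audit_by_group(records):
    """Compute the false positive rate separately for each group."""
    groups = {}
    for r in records:
        groups.setdefault(r["group"], []).append(r)
    return {group: false_positive_rate(rs) for group, rs in groups.items()}


# Made-up records: four people per group, none of whom reoffended.
RECORDS = (
    [{"group": "A", "flagged": f, "reoffended": False}
     for f in (True, True, False, False)]
    + [{"group": "B", "flagged": f, "reoffended": False}
       for f in (True, False, False, False)]
)

print(audit_by_group(RECORDS))  # {'A': 0.5, 'B': 0.25}
```

In this made-up data the model wrongly flags group A twice as often as group B, a gap that a single overall accuracy figure would hide.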

And finally, we have to consider the long-term effects of algorithms, the feedback loops they are engendering. That sounds abstract, but imagine if Facebook engineers had considered that before they decided to show us only things that our friends had posted.

І наприкінці, нам потрібно прийняти до уваги довготермінові ефекти алгоритмів, ланцюги зворотного зв'язку, що виникають. Звучить абстрактно, але уявіть, якщо інженери Facebook прийняли б це до уваги, перш ніж вони вирішили показувати нам лише те, що постять наші друзі.

I have two more messages, one for the data scientists out there. Data scientists: we should not be the arbiters of truth. We should be translators of ethical discussions that happen in larger society.

В мене є ще дві думки, що я хочу донести, одна для науковців з даних. Науковці з даних: нам не слід бути арбітрами правди. Нам слід бути перекладачами етичних дискусій, що відбуваються у ширшому суспільстві.

(Applause)

(Оплески)

And the rest of you, the non-data scientists: this is not a math test. This is a political fight. We need to demand accountability for our algorithmic overlords.

А щодо решти з вас, не-науковців з даних: це не тест з математики. Це політична боротьба. Ми повинні вимагати підзвітності від наших алгоритмічних можновладців.

(Applause)

(Оплески)

The era of blind faith in big data must end.

Епоха сліпої віри у великі дані має підійти до кінця.

Thank you very much.

Дуже вам дякую.

(Applause)

(Оплески)
