Cathy O'Neil: The era of blind faith in big data must end

Algorithms are everywhere. They sort and separate the winners from the losers. The winners get the job or a good credit card offer. The losers don't even get an interview or they pay more for insurance. We're being scored with secret formulas that we don't understand that often don't have systems of appeal. That begs the question: What if the algorithms are wrong?

To build an algorithm you need two things: you need data, what happened in the past, and a definition of success, the thing you're looking for and often hoping for. You train an algorithm by looking, figuring out. The algorithm figures out what is associated with success. What situation leads to success?

Actually, everyone uses algorithms. They just don't formalize them in written code. Let me give you an example. I use an algorithm every day to make a meal for my family. The data I use is the ingredients in my kitchen, the time I have, the ambition I have, and I curate that data. I don't count those little packages of ramen noodles as food.

(Laughter)

My definition of success is: a meal is successful if my kids eat vegetables. It's very different from if my youngest son were in charge. He'd say success is if he gets to eat lots of Nutella. But I get to choose success. I am in charge. My opinion matters. That's the first rule of algorithms.

Algorithms are opinions embedded in code. It's really different from what you think most people think of algorithms. They think algorithms are objective and true and scientific. That's a marketing trick. It's also a marketing trick to intimidate you with algorithms, to make you trust and fear algorithms because you trust and fear mathematics. A lot can go wrong when we put blind faith in big data.

This is Kiri Soares. She's a high school principal in Brooklyn. In 2011, she told me her teachers were being scored with a complex, secret algorithm called the "value-added model." I told her, "Well, figure out what the formula is, show it to me. I'm going to explain it to you." She said, "Well, I tried to get the formula, but my Department of Education contact told me it was math and I wouldn't understand it."

It gets worse. The New York Post filed a Freedom of Information Act request, got all the teachers' names and all their scores and they published them as an act of teacher-shaming. When I tried to get the formulas, the source code, through the same means, I was told I couldn't. I was denied. I later found out that nobody in New York City had access to that formula. No one understood it. Then someone really smart got involved, Gary Rubinstein. He found 665 teachers from that New York Post data that actually had two scores. That could happen if they were teaching seventh grade math and eighth grade math. He decided to plot them. Each dot represents a teacher.

(Laughter)

What is that?

(Laughter)

That should never have been used for individual assessment. It's almost a random number generator.
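
One way to make the "random number generator" remark concrete is to correlate the two scores each teacher received. The sketch below is only an illustration under stated assumptions: the scores are synthetic (drawn at random, not the New York Post data), the 0 to 100 scale is assumed, and `pearson` is a hypothetical helper written for this example. It contrasts what pure noise looks like with what a consistent scoring method would look like.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Synthetic stand-in data, NOT the real scores: each hypothetical teacher gets
# two scores drawn independently, which is what pure noise would look like.
rng = random.Random(0)
noise_a = [rng.uniform(0, 100) for _ in range(665)]
noise_b = [rng.uniform(0, 100) for _ in range(665)]

# A consistent scoring method for comparison: second score = first + small jitter.
stable_b = [a + rng.gauss(0, 5) for a in noise_a]

r_noise = pearson(noise_a, noise_b)
r_stable = pearson(noise_a, stable_b)
print(f"independent scores: r = {r_noise:.2f}")
print(f"consistent scores:  r = {r_stable:.2f}")
```

A score that measured something stable about a teacher would sit near the second case; a scatter plot with no visible pattern corresponds to the first.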

(Applause)

But it was. This is Sarah Wysocki. She got fired, along with 205 other teachers, from the Washington, DC school district, even though she had great recommendations from her principal and the parents of her kids.

I know what a lot of you guys are thinking, especially the data scientists, the AI experts here. You're thinking, "Well, I would never make an algorithm that inconsistent." But algorithms can go wrong, even have deeply destructive effects with good intentions. And whereas an airplane that's designed badly crashes to the earth and everyone sees it, an algorithm designed badly can go on for a long time, silently wreaking havoc.

This is Roger Ailes.

(Laughter)

He founded Fox News in 1996. More than 20 women complained about sexual harassment. They said they weren't allowed to succeed at Fox News. He was ousted last year, but we've seen recently that the problems have persisted. That begs the question: What should Fox News do to turn over another leaf?

Well, what if they replaced their hiring process with a machine-learning algorithm? That sounds good, right? Think about it. The data, what would the data be? A reasonable choice would be the last 21 years of applications to Fox News. Reasonable. What about the definition of success? Reasonable choice would be, well, who is successful at Fox News? I guess someone who, say, stayed there for four years and was promoted at least once. Sounds reasonable. And then the algorithm would be trained. It would be trained to look for people to learn what led to success, what kind of applications historically led to success by that definition. Now think about what would happen if we applied that to a current pool of applicants. It would filter out women because they do not look like people who were successful in the past.
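
The hiring thought experiment can be sketched in a few lines. This is a minimal illustration, not a description of any real system: the historical records are invented, the "model" is just the past success rate per group, and `train` and `screen` are hypothetical names. For brevity the sketch uses group membership directly; real systems often pick up the same signal through correlated features instead.

```python
from collections import defaultdict

# Hypothetical historical records: (group, was_successful).
# "Successful" follows the talk's definition: stayed four years, promoted once.
# The numbers are invented to illustrate a workplace where women rarely succeeded.
history = (
    [("man", True)] * 60 + [("man", False)] * 40
    + [("woman", True)] * 5 + [("woman", False)] * 95
)

def train(records):
    """Learn the historical success rate for each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [successes, total]
    for group, success in records:
        counts[group][0] += int(success)
        counts[group][1] += 1
    return {g: s / t for g, (s, t) in counts.items()}

def screen(applicants, model, threshold=0.5):
    """Keep applicants whose predicted success rate meets the threshold."""
    return [a for a in applicants if model.get(a["group"], 0.0) >= threshold]

model = train(history)
applicants = [
    {"name": "A", "group": "woman"},
    {"name": "B", "group": "man"},
    {"name": "C", "group": "woman"},
]
shortlist = screen(applicants, model)
print(model)
print([a["name"] for a in shortlist])
```

Nothing in the code mentions qualifications. The filter simply replays who succeeded before, which is the point of the example.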

Algorithms don't make things fair if you just blithely, blindly apply algorithms. They don't make things fair. They repeat our past practices, our patterns. They automate the status quo. That would be great if we had a perfect world, but we don't. And I'll add that most companies don't have embarrassing lawsuits, but the data scientists in those companies are told to follow the data, to focus on accuracy. Think about what that means. Because we all have bias, it means they could be codifying sexism or any other kind of bigotry.

Thought experiment, because I like them: an entirely segregated society -- racially segregated, all towns, all neighborhoods and where we send the police only to the minority neighborhoods to look for crime. The arrest data would be very biased. What if, on top of that, we found the data scientists and paid the data scientists to predict where the next crime would occur? Minority neighborhood. Or to predict who the next criminal would be? A minority. The data scientists would brag about how great and how accurate their model would be, and they'd be right.
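
The thought experiment can be simulated to show why the model looks accurate. The sketch below rests on assumptions chosen purely for illustration: two hypothetical neighborhoods, "M" and "W", with the same true crime rate, and police present only in "M". All names and rates are invented.

```python
import random

rng = random.Random(1)

# Assumption for illustration: crime occurs at the same true rate in both
# hypothetical neighborhoods, but police patrol only neighborhood "M".
TRUE_CRIME_RATE = 0.1
PATROLLED = {"M"}

def simulate_arrests(n_events):
    """Return the neighborhoods where arrests were recorded."""
    arrests = []
    for _ in range(n_events):
        place = rng.choice(["M", "W"])
        crime = rng.random() < TRUE_CRIME_RATE
        if crime and place in PATROLLED:  # unobserved crime leaves no record
            arrests.append(place)
    return arrests

train_arrests = simulate_arrests(10_000)
test_arrests = simulate_arrests(10_000)

# "Model": predict that the next arrest happens wherever most past arrests did.
prediction = max(set(train_arrests), key=train_arrests.count)

accuracy = sum(place == prediction for place in test_arrests) / len(test_arrests)
print(f"predicted hotspot: {prediction}, accuracy on arrest data: {accuracy:.0%}")
```

The model is scored against arrest records, and arrest records exist only where police were sent, so high accuracy here says nothing about where crime actually occurs.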

Now, reality isn't that drastic, but we do have severe segregation in many cities and towns, and we have plenty of evidence of biased policing and justice system data. And we actually do predict hotspots, places where crimes will occur. And we do predict, in fact, the individual criminality, the criminality of individuals. The news organization ProPublica recently looked into one of those "recidivism risk" algorithms, as they're called, being used in Florida during sentencing by judges. Bernard, on the left, the black man, was scored a 10 out of 10. Dylan, on the right, 3 out of 10. 10 out of 10, high risk. 3 out of 10, low risk. They were both brought in for drug possession. They both had records, but Dylan had a felony and Bernard didn't. This matters, because the higher your score, the more likely you are to be given a longer sentence.

What's going on? Data laundering. It's a process by which technologists hide ugly truths inside black box algorithms and call them objective; call them meritocratic. When they're secret, important and destructive, I've coined a term for these algorithms: "weapons of math destruction."

(Laughter)

(Applause)

They're everywhere, and it's not a mistake. These are private companies building private algorithms for private ends. Even the ones I talked about for teachers and the public police, those were built by private companies and sold to the government institutions. They call it their "secret sauce" -- that's why they can't tell us about it. It's also private power. They are profiting from wielding the authority of the inscrutable. Now you might think, since all this stuff is private and there's competition, maybe the free market will solve this problem. It won't. There's a lot of money to be made in unfairness.

Also, we're not economic rational agents. We all are biased. We're all racist and bigoted in ways that we wish we weren't, in ways that we don't even know. We know this, though, in aggregate, because sociologists have consistently demonstrated this with these experiments they build, where they send a bunch of applications to jobs out, equally qualified but some have white-sounding names and some have black-sounding names, and it's always disappointing, the results -- always.

So we are the ones that are biased, and we are injecting those biases into the algorithms by choosing what data to collect, like I chose not to think about ramen noodles -- I decided it was irrelevant. But by trusting the data that's actually picking up on past practices and by choosing the definition of success, how can we expect the algorithms to emerge unscathed? We can't. We have to check them. We have to check them for fairness.

The good news is, we can check them for fairness. Algorithms can be interrogated, and they will tell us the truth every time. And we can fix them. We can make them better. I call this an algorithmic audit, and I'll walk you through it.

First, data integrity check. For the recidivism risk algorithm I talked about, a data integrity check would mean we'd have to come to terms with the fact that in the US, whites and blacks smoke pot at the same rate but blacks are far more likely to be arrested -- four or five times more likely, depending on the area. What is that bias looking like in other crime categories, and how do we account for it?

Second, we should think about the definition of success, audit that. Remember -- with the hiring algorithm? We talked about it. Someone who stays for four years and is promoted once? Well, that is a successful employee, but it's also an employee that is supported by their culture. That said, also it can be quite biased. We need to separate those two things. We should look to the blind orchestra audition as an example. That's where the people auditioning are behind a sheet. What I want to think about there is the people who are listening have decided what's important and they've decided what's not important, and they're not getting distracted by that. When the blind orchestra auditions started, the number of women in orchestras went up by a factor of five.

Next, we have to consider accuracy. This is where the value-added model for teachers would fail immediately. No algorithm is perfect, of course, so we have to consider the errors of every algorithm. How often are there errors, and for whom does this model fail? What is the cost of that failure?
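
Asking "for whom does this model fail?" amounts to breaking the error rate down by group instead of reporting one overall number. The sketch below is a minimal illustration with invented audit records; the groups "A" and "B", the record layout, and the helper `error_rates_by_group` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical audit records: (group, predicted_high_risk, actually_reoffended).
# The values are invented for illustration only.
records = [
    ("A", True, False), ("A", True, True), ("A", True, False), ("A", False, False),
    ("B", False, False), ("B", True, True), ("B", False, True), ("B", False, False),
]

def error_rates_by_group(rows):
    """Compute false positive and false negative rates separately per group."""
    tally = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for group, predicted, actual in rows:
        t = tally[group]
        if actual:
            t["pos"] += 1
            t["fn"] += int(not predicted)
        else:
            t["neg"] += 1
            t["fp"] += int(predicted)
    return {
        g: {
            "false_positive_rate": t["fp"] / t["neg"] if t["neg"] else None,
            "false_negative_rate": t["fn"] / t["pos"] if t["pos"] else None,
        }
        for g, t in tally.items()
    }

rates = error_rates_by_group(records)
for group, r in rates.items():
    print(group, r)
```

In this toy data the two groups see different kinds of mistakes: one is wrongly flagged as high risk more often, the other is wrongly cleared more often. The cost of each kind of error then has to be weighed separately.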

And finally, we have to consider the long-term effects of algorithms, the feedback loops they are engendering. That sounds abstract, but imagine if Facebook engineers had considered that before they decided to show us only things that our friends had posted.

I have two more messages, one for the data scientists out there. Data scientists: we should not be the arbiters of truth. We should be translators of ethical discussions that happen in larger society.

(Applause)

And the rest of you, the non-data scientists: this is not a math test. This is a political fight. We need to demand accountability for our algorithmic overlords.

(Applause)

The era of blind faith in big data must end.

Thank you very much.

(Applause)
