Cathy O'Neil: The era of blind faith in big data must end

Algorithms are everywhere. They sort and separate the winners from the losers. The winners get the job or a good credit card offer. The losers don't even get an interview or they pay more for insurance. We're being scored with secret formulas that we don't understand that often don't have systems of appeal. That begs the question: What if the algorithms are wrong?

To build an algorithm you need two things: you need data, what happened in the past, and a definition of success, the thing you're looking for and often hoping for. You train an algorithm by looking, figuring out. The algorithm figures out what is associated with success. What situation leads to success?

Actually, everyone uses algorithms. They just don't formalize them in written code. Let me give you an example. I use an algorithm every day to make a meal for my family. The data I use is the ingredients in my kitchen, the time I have, the ambition I have, and I curate that data. I don't count those little packages of ramen noodles as food.

(Laughter)

My definition of success is: a meal is successful if my kids eat vegetables. It's very different from if my youngest son were in charge. He'd say success is if he gets to eat lots of Nutella. But I get to choose success. I am in charge. My opinion matters. That's the first rule of algorithms.

Algorithms are opinions embedded in code. It's really different from what you think most people think of algorithms. They think algorithms are objective and true and scientific. That's a marketing trick. It's also a marketing trick to intimidate you with algorithms, to make you trust and fear algorithms because you trust and fear mathematics. A lot can go wrong when we put blind faith in big data.

This is Kiri Soares. She's a high school principal in Brooklyn. In 2011, she told me her teachers were being scored with a complex, secret algorithm called the "value-added model." I told her, "Well, figure out what the formula is, show it to me. I'm going to explain it to you." She said, "Well, I tried to get the formula, but my Department of Education contact told me it was math and I wouldn't understand it."

It gets worse. The New York Post filed a Freedom of Information Act request, got all the teachers' names and all their scores and they published them as an act of teacher-shaming. When I tried to get the formulas, the source code, through the same means, I was told I couldn't. I was denied. I later found out that nobody in New York City had access to that formula. No one understood it. Then someone really smart got involved, Gary Rubinstein. He found 665 teachers from that New York Post data that actually had two scores. That could happen if they were teaching seventh grade math and eighth grade math. He decided to plot them. Each dot represents a teacher.

(Laughter)

What is that?

(Laughter)

That should never have been used for individual assessment. It's almost a random number generator.
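
One way to see why two unrelated scores for the same teacher are a warning sign is to measure how strongly the two scores agree. The Python sketch below is only an illustration: the scores are synthetic random numbers, not the New York Post data, and the variable names and the 0 to 100 scale are assumptions. It compares pure noise with a hypothetical consistent measure.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Synthetic stand-in data (an assumption, NOT the real scores):
# 665 teachers, each with two independently drawn random scores.
rng = random.Random(0)
score_grade7 = [rng.uniform(0, 100) for _ in range(665)]
score_grade8 = [rng.uniform(0, 100) for _ in range(665)]

# Two scores that are pure noise barely agree with each other.
noise_r = pearson(score_grade7, score_grade8)

# A hypothetical consistent measure: the second score is the first plus
# a small measurement error, so the two scores agree closely.
consistent_scores = [s + rng.gauss(0, 5) for s in score_grade7]
consistent_r = pearson(score_grade7, consistent_scores)

print(f"random scores:     r = {noise_r:.2f}")
print(f"consistent scores: r = {consistent_r:.2f}")
```

A correlation near zero means that knowing one of a teacher's scores tells you almost nothing about the other, which is what a random number generator would produce.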

(Applause)

But it was. This is Sarah Wysocki. She got fired, along with 205 other teachers, from the Washington, DC school district, even though she had great recommendations from her principal and the parents of her kids.

I know what a lot of you guys are thinking, especially the data scientists, the AI experts here. You're thinking, "Well, I would never make an algorithm that inconsistent." But algorithms can go wrong, even have deeply destructive effects with good intentions. And whereas an airplane that's designed badly crashes to the earth and everyone sees it, an algorithm designed badly can go on for a long time, silently wreaking havoc.

This is Roger Ailes.

(Laughter)

He founded Fox News in 1996. More than 20 women complained about sexual harassment. They said they weren't allowed to succeed at Fox News. He was ousted last year, but we've seen recently that the problems have persisted. That begs the question: What should Fox News do to turn over another leaf?

Well, what if they replaced their hiring process with a machine-learning algorithm? That sounds good, right? Think about it. The data, what would the data be? A reasonable choice would be the last 21 years of applications to Fox News. Reasonable. What about the definition of success? Reasonable choice would be, well, who is successful at Fox News? I guess someone who, say, stayed there for four years and was promoted at least once. Sounds reasonable. And then the algorithm would be trained. It would be trained to look for people to learn what led to success, what kind of applications historically led to success by that definition. Now think about what would happen if we applied that to a current pool of applicants. It would filter out women because they do not look like people who were successful in the past.
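
The hiring scenario can be sketched in a few lines of Python. This is a deliberately tiny illustration under stated assumptions: the historical records are invented, the "model" is nothing more than a per-group success rate, and the names `train` and `screen` are hypothetical. "Successful" follows the definition in the talk (stayed four years and was promoted at least once).

```python
from collections import defaultdict

# Invented historical records: (gender, was_successful).
# The counts are assumptions chosen only to mimic a biased history.
history = (
    [("man", True)] * 60 + [("man", False)] * 40
    + [("woman", True)] * 5 + [("woman", False)] * 45
)


def train(records):
    """Learn the historical success rate for each value of the feature."""
    totals, wins = defaultdict(int), defaultdict(int)
    for feature, success in records:
        totals[feature] += 1
        wins[feature] += success  # True counts as 1, False as 0
    return {f: wins[f] / totals[f] for f in totals}


def screen(model, applicants, threshold=0.5):
    """Keep applicants whose predicted success rate meets the threshold."""
    return [a for a in applicants if model.get(a["gender"], 0.0) >= threshold]


model = train(history)
applicants = [
    {"name": "A", "gender": "woman"},
    {"name": "B", "gender": "man"},
    {"name": "C", "gender": "woman"},
]
shortlisted = screen(model, applicants)

print(model)
print([a["name"] for a in shortlisted])
```

The model never sees anything about the applicants' abilities. It only reproduces who succeeded before, so the two women are filtered out. A real system would rarely use gender as an explicit feature, but other fields in an application can act as proxies for it and produce the same effect.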

Algorithms don't make things fair if you just blithely, blindly apply algorithms. They don't make things fair. They repeat our past practices, our patterns. They automate the status quo. That would be great if we had a perfect world, but we don't. And I'll add that most companies don't have embarrassing lawsuits, but the data scientists in those companies are told to follow the data, to focus on accuracy. Think about what that means. Because we all have bias, it means they could be codifying sexism or any other kind of bigotry.

Thought experiment, because I like them: an entirely segregated society -- racially segregated, all towns, all neighborhoods and where we send the police only to the minority neighborhoods to look for crime. The arrest data would be very biased. What if, on top of that, we found the data scientists and paid the data scientists to predict where the next crime would occur? Minority neighborhood. Or to predict who the next criminal would be? A minority. The data scientists would brag about how great and how accurate their model would be, and they'd be right.
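
The thought experiment can be written out as a small Python sketch. All numbers are invented, and the "model" is an assumption kept as simple as possible: it predicts the next arrest wherever most past arrests were recorded. The point it illustrates is that the model can score perfectly against arrest data while saying nothing about where offenses actually happen.

```python
# Invented numbers: both neighborhoods have the same underlying offense
# rate, but police are sent to only one of them.
neighborhoods = {
    "minority": {"residents": 1000, "offense_rate": 0.05, "patrolled": True},
    "majority": {"residents": 1000, "offense_rate": 0.05, "patrolled": False},
}


def recorded_arrests(n):
    """Offenses only become arrest records where police are looking."""
    if not n["patrolled"]:
        return 0
    return round(n["residents"] * n["offense_rate"])


arrests = {name: recorded_arrests(n) for name, n in neighborhoods.items()}

# "Model": predict the next arrest where most past arrests were recorded.
predicted_hotspot = max(arrests, key=arrests.get)

# Evaluate against the next round of arrest records, produced the same way.
next_arrests = {name: recorded_arrests(n) for name, n in neighborhoods.items()}
accuracy = next_arrests[predicted_hotspot] / sum(next_arrests.values())

print(arrests)
print(predicted_hotspot, accuracy)
```

The accuracy is measured against the same biased records the model learned from, so it comes out perfect even though the two neighborhoods have identical offense rates.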

Now, reality isn't that drastic, but we do have severe segregations in many cities and towns, and we have plenty of evidence of biased policing and justice system data. And we actually do predict hotspots, places where crimes will occur. And we do predict, in fact, the individual criminality, the criminality of individuals. The news organization ProPublica recently looked into one of those "recidivism risk" algorithms, as they're called, being used in Florida during sentencing by judges. Bernard, on the left, the black man, was scored a 10 out of 10. Dylan, on the right, 3 out of 10. 10 out of 10, high risk. 3 out of 10, low risk. They were both brought in for drug possession. They both had records, but Dylan had a felony and Bernard didn't. This matters, because the higher your score, the more likely you are to be given a longer sentence.

What's going on? Data laundering. It's a process by which technologists hide ugly truths inside black box algorithms and call them objective; call them meritocratic. When they're secret, important and destructive, I've coined a term for these algorithms: "weapons of math destruction."

(Laughter)

(Applause)

They're everywhere, and it's not a mistake. These are private companies building private algorithms for private ends. Even the ones I talked about for teachers and the public police, those were built by private companies and sold to the government institutions. They call it their "secret sauce" -- that's why they can't tell us about it. It's also private power. They are profiting from wielding the authority of the inscrutable. Now you might think, since all this stuff is private and there's competition, maybe the free market will solve this problem. It won't. There's a lot of money to be made in unfairness.

Also, we're not economic rational agents. We all are biased. We're all racist and bigoted in ways that we wish we weren't, in ways that we don't even know. We know this, though, in aggregate, because sociologists have consistently demonstrated this with these experiments they build, where they send a bunch of applications to jobs out, equally qualified but some have white-sounding names and some have black-sounding names, and it's always disappointing, the results -- always.

So we are the ones that are biased, and we are injecting those biases into the algorithms by choosing what data to collect, like I chose not to think about ramen noodles -- I decided it was irrelevant. But by trusting the data that's actually picking up on past practices and by choosing the definition of success, how can we expect the algorithms to emerge unscathed? We can't. We have to check them. We have to check them for fairness.

The good news is, we can check them for fairness. Algorithms can be interrogated, and they will tell us the truth every time. And we can fix them. We can make them better. I call this an algorithmic audit, and I'll walk you through it.

First, data integrity check. For the recidivism risk algorithm I talked about, a data integrity check would mean we'd have to come to terms with the fact that in the US, whites and blacks smoke pot at the same rate but blacks are far more likely to be arrested -- four or five times more likely, depending on the area. What is that bias looking like in other crime categories, and how do we account for it?

Second, we should think about the definition of success, audit that. Remember -- with the hiring algorithm? We talked about it. Someone who stays for four years and is promoted once? Well, that is a successful employee, but it's also an employee that is supported by their culture. That said, also it can be quite biased. We need to separate those two things. We should look to the blind orchestra audition as an example. That's where the people auditioning are behind a sheet. What I want to think about there is the people who are listening have decided what's important and they've decided what's not important, and they're not getting distracted by that. When the blind orchestra auditions started, the number of women in orchestras went up by a factor of five.

Next, we have to consider accuracy. This is where the value-added model for teachers would fail immediately. No algorithm is perfect, of course, so we have to consider the errors of every algorithm. How often are there errors, and for whom does this model fail? What is the cost of that failure?
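
Asking "for whom does this model fail?" can be made concrete by computing error rates separately for each group instead of one overall accuracy figure. The Python sketch below is one possible way to do that, under stated assumptions: the records are invented, the group labels are placeholders, and the false positive rate is only one of several error measures an audit might use.

```python
from collections import defaultdict


def false_positive_rates(records):
    """False positive rate per group: the share of people who did NOT
    reoffend but were still labeled high risk."""
    flagged, negatives = defaultdict(int), defaultdict(int)
    for group, predicted_high_risk, reoffended in records:
        if not reoffended:
            negatives[group] += 1
            flagged[group] += predicted_high_risk  # True counts as 1
    return {g: flagged[g] / negatives[g] for g in negatives}


# Invented records: (group, predicted_high_risk, reoffended)
records = (
    [("group_a", True, False)] * 40 + [("group_a", False, False)] * 60
    + [("group_b", True, False)] * 20 + [("group_b", False, False)] * 80
    + [("group_a", True, True)] * 50 + [("group_b", True, True)] * 50
)

rates = false_positive_rates(records)
print(rates)
```

In this invented data the model is wrong about people in `group_a` twice as often as about people in `group_b`, and the cost of each of those errors is a wrongly assigned high-risk label. A single overall accuracy number would hide that difference.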

And finally, we have to consider the long-term effects of algorithms, the feedback loops that they are engendering. That sounds abstract, but imagine if Facebook engineers had considered that before they decided to show us only things that our friends had posted.

I have two more messages, one for the data scientists out there. Data scientists: we should not be the arbiters of truth. We should be translators of ethical discussions that happen in larger society.

(Applause)

And the rest of you, the non-data scientists: this is not a math test. This is a political fight. We need to demand accountability for our algorithmic overlords.

(Applause)

The era of blind faith in big data must end.

Thank you very much.

(Applause)
