Cathy O'Neil: The era of blind faith in big data must end

Algorithms are everywhere. They sort and separate the winners from the losers. The winners get the job or a good credit card offer. The losers don't even get an interview or they pay more for insurance. We're being scored with secret formulas that we don't understand that often don't have systems of appeal. That begs the question: What if the algorithms are wrong?

To build an algorithm you need two things: you need data, what happened in the past, and a definition of success, the thing you're looking for and often hoping for. You train an algorithm by looking, figuring out. The algorithm figures out what is associated with success. What situation leads to success?

Actually, everyone uses algorithms. They just don't formalize them in written code. Let me give you an example. I use an algorithm every day to make a meal for my family. The data I use is the ingredients in my kitchen, the time I have, the ambition I have, and I curate that data. I don't count those little packages of ramen noodles as food.

(Laughter)

My definition of success is: a meal is successful if my kids eat vegetables. It's very different from if my youngest son were in charge. He'd say success is if he gets to eat lots of Nutella. But I get to choose success. I am in charge. My opinion matters. That's the first rule of algorithms.

Algorithms are opinions embedded in code. It's really different from what you think most people think of algorithms. They think algorithms are objective and true and scientific. That's a marketing trick. It's also a marketing trick to intimidate you with algorithms, to make you trust and fear algorithms because you trust and fear mathematics. A lot can go wrong when we put blind faith in big data.

This is Kiri Soares. She's a high school principal in Brooklyn. In 2011, she told me her teachers were being scored with a complex, secret algorithm called the "value-added model." I told her, "Well, figure out what the formula is, show it to me. I'm going to explain it to you." She said, "Well, I tried to get the formula, but my Department of Education contact told me it was math and I wouldn't understand it."

It gets worse. The New York Post filed a Freedom of Information Act request, got all the teachers' names and all their scores and they published them as an act of teacher-shaming. When I tried to get the formulas, the source code, through the same means, I was told I couldn't. I was denied. I later found out that nobody in New York City had access to that formula. No one understood it. Then someone really smart got involved, Gary Rubinstein. He found 665 teachers from that New York Post data that actually had two scores. That could happen if they were teaching seventh grade math and eighth grade math. He decided to plot them. Each dot represents a teacher.

(Laughter)

What is that?

(Laughter)

That should never have been used for individual assessment. It's almost a random number generator.
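
One way to check the "random number generator" claim is to measure how well a teacher's two scores agree. The sketch below is only an illustration: the scores are invented random numbers, not the New York Post data, and `pearson` is a hypothetical helper written for this example. It shows what near-random agreement looks like next to perfect agreement.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: 665 teachers, each given two independent random scores.
rng = random.Random(0)
score_a = [rng.uniform(0, 100) for _ in range(665)]
score_b = [rng.uniform(0, 100) for _ in range(665)]

noise_r = pearson(score_a, score_b)       # close to 0: the two scores disagree
consistent_r = pearson(score_a, score_a)  # 1 up to rounding: the scores agree
print(f"independent scores: r = {noise_r:.2f}")
print(f"consistent scores:  r = {consistent_r:.2f}")
```

A scoring model that is fit for individual assessment should land much closer to the second number than the first when the same teacher is scored twice.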

(Applause)

But it was. This is Sarah Wysocki. She got fired, along with 205 other teachers, from the Washington, DC school district, even though she had great recommendations from her principal and the parents of her kids.

I know what a lot of you guys are thinking, especially the data scientists, the AI experts here. You're thinking, "Well, I would never make an algorithm that inconsistent." But algorithms can go wrong, even have deeply destructive effects with good intentions. And whereas an airplane that's designed badly crashes to the earth and everyone sees it, an algorithm designed badly can go on for a long time, silently wreaking havoc.

This is Roger Ailes.

(Laughter)

He founded Fox News in 1996. More than 20 women complained about sexual harassment. They said they weren't allowed to succeed at Fox News. He was ousted last year, but we've seen recently that the problems have persisted. That begs the question: What should Fox News do to turn over another leaf?

Well, what if they replaced their hiring process with a machine-learning algorithm? That sounds good, right? Think about it. The data, what would the data be? A reasonable choice would be the last 21 years of applications to Fox News. Reasonable. What about the definition of success? Reasonable choice would be, well, who is successful at Fox News? I guess someone who, say, stayed there for four years and was promoted at least once. Sounds reasonable. And then the algorithm would be trained. It would be trained to look for people to learn what led to success, what kind of applications historically led to success by that definition. Now think about what would happen if we applied that to a current pool of applicants. It would filter out women because they do not look like people who were successful in the past.
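
The thought experiment can be sketched in a few lines. Everything here is an assumption made for illustration: the historical counts are invented, and `train` and `screen` are hypothetical helpers, not any real hiring system. A real model would more likely pick up gender through proxy features, but the effect is the same: it learns the historical success rate and reproduces it.

```python
from collections import defaultdict

# Hypothetical history: (gender, succeeded), where "succeeded" means the person
# stayed four years and was promoted at least once. The counts are invented.
history = (
    [("man", True)] * 60 + [("man", False)] * 40
    + [("woman", True)] * 5 + [("woman", False)] * 45
)

def train(records):
    """Learn the historical success rate for each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [successes, total]
    for group, succeeded in records:
        counts[group][0] += int(succeeded)
        counts[group][1] += 1
    return {g: s / t for g, (s, t) in counts.items()}

def screen(applicants, model, threshold=0.5):
    """Keep applicants whose group's historical success rate meets the threshold."""
    return [a for a in applicants if model[a["gender"]] >= threshold]

model = train(history)
applicants = [
    {"name": "A", "gender": "woman"},
    {"name": "B", "gender": "man"},
    {"name": "C", "gender": "woman"},
]
shortlist = screen(applicants, model)
print(model)                           # learned success rates per group
print([a["name"] for a in shortlist])  # only applicant B passes the screen
```

Nothing in the code mentions harassment or a hostile culture. The past outcomes carry that information in, and the definition of success turns it into a filter.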

Algorithms don't make things fair if you just blithely, blindly apply algorithms. They don't make things fair. They repeat our past practices, our patterns. They automate the status quo. That would be great if we had a perfect world, but we don't. And I'll add that most companies don't have embarrassing lawsuits, but the data scientists in those companies are told to follow the data, to focus on accuracy. Think about what that means. Because we all have bias, it means they could be codifying sexism or any other kind of bigotry.

Thought experiment, because I like them: an entirely segregated society -- racially segregated, all towns, all neighborhoods and where we send the police only to the minority neighborhoods to look for crime. The arrest data would be very biased. What if, on top of that, we found the data scientists and paid the data scientists to predict where the next crime would occur? Minority neighborhood. Or to predict who the next criminal would be? A minority. The data scientists would brag about how great and how accurate their model would be, and they'd be right.
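
A minimal sketch of why the model in this thought experiment looks accurate, under invented assumptions: two neighborhoods with the same true crime count, police sent to only one of them, and a hypothetical predictor that picks the neighborhood with the most recorded arrests.

```python
# Hypothetical setup: equal true crime in both neighborhoods,
# but police are sent only to the minority neighborhood.
true_crimes = {"minority": 100, "majority": 100}
patrolled = {"minority": True, "majority": False}

# Arrests are recorded only where police are present to make them.
arrests = {hood: (n if patrolled[hood] else 0) for hood, n in true_crimes.items()}

def predict_next_arrest(arrest_counts):
    """Predict the neighborhood with the most recorded arrests."""
    return max(arrest_counts, key=arrest_counts.get)

prediction = predict_next_arrest(arrests)

# "Accuracy" measured against arrest records, the only data the model sees.
accuracy_vs_arrests = arrests[prediction] / sum(arrests.values())
# Share of true crime that actually happens in the predicted neighborhood.
share_of_true_crime = true_crimes[prediction] / sum(true_crimes.values())

print(prediction, accuracy_vs_arrests, share_of_true_crime)
```

Scored against the arrest data, the prediction is always right, even though in this toy setup only half of the true crime happens where the model points.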

Now, reality isn't that drastic, but we do have severe segregation in many cities and towns, and we have plenty of evidence of biased policing and justice system data. And we actually do predict hotspots, places where crimes will occur. And we do predict, in fact, the individual criminality, the criminality of individuals. The news organization ProPublica recently looked into one of those "recidivism risk" algorithms, as they're called, being used in Florida during sentencing by judges. Bernard, on the left, the black man, was scored a 10 out of 10. Dylan, on the right, 3 out of 10. 10 out of 10, high risk. 3 out of 10, low risk. They were both brought in for drug possession. They both had records, but Dylan had a felony and Bernard didn't. This matters, because the higher your score, the more likely you are to be given a longer sentence.

What's going on? Data laundering. It's a process by which technologists hide ugly truths inside black box algorithms and call them objective; call them meritocratic. When they're secret, important and destructive, I've coined a term for these algorithms: "weapons of math destruction."

(Laughter)

(Applause)

They're everywhere, and it's not a mistake. These are private companies building private algorithms for private ends. Even the ones I talked about for teachers and the public police, those were built by private companies and sold to the government institutions. They call it their "secret sauce" -- that's why they can't tell us about it. It's also private power. They are profiting from wielding the authority of the inscrutable. Now you might think, since all this stuff is private and there's competition, maybe the free market will solve this problem. It won't. There's a lot of money to be made in unfairness.

Also, we're not economic rational agents. We all are biased. We're all racist and bigoted in ways that we wish we weren't, in ways that we don't even know. We know this, though, in aggregate, because sociologists have consistently demonstrated this with these experiments they build, where they send a bunch of applications to jobs out, equally qualified but some have white-sounding names and some have black-sounding names, and it's always disappointing, the results -- always.

So we are the ones that are biased, and we are injecting those biases into the algorithms by choosing what data to collect, like I chose not to think about ramen noodles -- I decided it was irrelevant. But by trusting the data that's actually picking up on past practices and by choosing the definition of success, how can we expect the algorithms to emerge unscathed? We can't. We have to check them. We have to check them for fairness.

The good news is, we can check them for fairness. Algorithms can be interrogated, and they will tell us the truth every time. And we can fix them. We can make them better. I call this an algorithmic audit, and I'll walk you through it.

First, data integrity check. For the recidivism risk algorithm I talked about, a data integrity check would mean we'd have to come to terms with the fact that in the US, whites and blacks smoke pot at the same rate but blacks are far more likely to be arrested -- four or five times more likely, depending on the area. What is that bias looking like in other crime categories, and how do we account for it?
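
As a rough illustration of what such a check might compute, the sketch below compares arrests per user across two groups. The counts are invented and `arrest_disparity` is a hypothetical helper; only the "same rate of use, about four times the arrests" pattern comes from the talk.

```python
def arrest_disparity(users_a, arrests_a, users_b, arrests_b):
    """Ratio of arrests-per-user in group A to arrests-per-user in group B."""
    # Cross-multiplied form of (arrests_a / users_a) / (arrests_b / users_b).
    return (arrests_a * users_b) / (arrests_b * users_a)

# Hypothetical counts: the same number of users in each group,
# but group A is arrested four times as often.
ratio = arrest_disparity(users_a=1000, arrests_a=80, users_b=1000, arrests_b=20)
print(ratio)  # 4.0
```

A ratio far from 1 means that arrest records measure policing as much as behavior, so a model trained on them inherits that gap.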

Second, we should think about the definition of success, audit that. Remember -- with the hiring algorithm? We talked about it. Someone who stays for four years and is promoted once? Well, that is a successful employee, but it's also an employee that is supported by their culture. That said, also it can be quite biased. We need to separate those two things. We should look to the blind orchestra audition as an example. That's where the people auditioning are behind a sheet. What I want to think about there is the people who are listening have decided what's important and they've decided what's not important, and they're not getting distracted by that. When the blind orchestra auditions started, the number of women in orchestras went up by a factor of five.

Next, we have to consider accuracy. This is where the value-added model for teachers would fail immediately. No algorithm is perfect, of course, so we have to consider the errors of every algorithm. How often are there errors, and for whom does this model fail? What is the cost of that failure?
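
The question "for whom does this model fail?" can be made concrete by computing error rates per group rather than one overall number. This is a sketch under stated assumptions: the audit sample is invented, the field names are hypothetical, and the false positive rate is just one of several error measures an audit might use.

```python
def false_positive_rate(rows, group):
    """Share of people in `group` who did not reoffend but were flagged high risk."""
    negatives = [r for r in rows if r["group"] == group and not r["reoffended"]]
    if not negatives:
        return 0.0
    flagged = sum(1 for r in negatives if r["high_risk"])
    return flagged / len(negatives)

# Hypothetical audit sample; every record here is invented.
rows = (
    [{"group": "A", "high_risk": True, "reoffended": False}] * 4
    + [{"group": "A", "high_risk": False, "reoffended": False}] * 6
    + [{"group": "B", "high_risk": True, "reoffended": False}] * 2
    + [{"group": "B", "high_risk": False, "reoffended": False}] * 8
    + [{"group": "A", "high_risk": True, "reoffended": True}] * 5
    + [{"group": "B", "high_risk": True, "reoffended": True}] * 5
)

fpr_a = false_positive_rate(rows, "A")  # 4 of 10 non-reoffenders flagged
fpr_b = false_positive_rate(rows, "B")  # 2 of 10 non-reoffenders flagged
print(fpr_a, fpr_b)
```

In this toy sample the model catches every reoffender in both groups, yet wrongly flags people in group A twice as often, which is exactly the kind of gap an overall accuracy figure hides.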

And finally, we have to consider the long-term effects of algorithms, the feedback loops that they are engendering. That sounds abstract, but imagine if Facebook engineers had considered that before they decided to show us only things that our friends had posted.

I have two more messages, one for the data scientists out there. Data scientists: we should not be the arbiters of truth. We should be translators of ethical discussions that happen in larger society.

(Applause)

And the rest of you, the non-data scientists: this is not a math test. This is a political fight. We need to demand accountability for our algorithmic overlords.

(Applause)

The era of blind faith in big data must end.

Thank you very much.

(Applause)
