Peter Donnelly: How juries are fooled by statistics

As other speakers have said, it's a rather daunting experience -- a particularly daunting experience -- to be speaking in front of this audience. But unlike the other speakers, I'm not going to tell you about the mysteries of the universe, or the wonders of evolution, or the really clever, innovative ways people are attacking the major inequalities in our world. Or even the challenges of nation-states in the modern global economy. My brief, as you've just heard, is to tell you about statistics -- and, to be more precise, to tell you some exciting things about statistics. And that's -- (Laughter) -- that's rather more challenging than all the speakers before me and all the ones coming after me. (Laughter) One of my senior colleagues told me, when I was a youngster in this profession, rather proudly, that statisticians were people who liked figures but didn't have the personality skills to become accountants. (Laughter) And there's another in-joke among statisticians, and that's, "How do you tell the introverted statistician from the extroverted statistician?" To which the answer is, "The extroverted statistician's the one who looks at the other person's shoes." (Laughter) But I want to tell you something useful -- and here it is, so concentrate now. This evening, there's a reception in the University's Museum of Natural History. And it's a wonderful setting, as I hope you'll find, and a great icon to the best of the Victorian tradition. It's very unlikely -- in this special setting, and this collection of people -- but you might just find yourself talking to someone you'd rather wish that you weren't. So here's what you do. When they say to you, "What do you do?" -- you say, "I'm a statistician." (Laughter) Well, except they've been pre-warned now, and they'll know you're making it up. And then one of two things will happen. They'll either discover their long-lost cousin in the other corner of the room and run over and talk to them. 
Or they'll suddenly become parched and/or hungry -- and often both -- and sprint off for a drink and some food. And you'll be left in peace to talk to the person you really want to talk to.

Como outros oradores já disseram, é uma experiência assustadora falar em frente desta audiência. Mas, ao contrário dos outros oradores, não vou falar dos mistérios do universo nem das maravilhas da evolução, nem das formas inteligentes e inovadoras com que as pessoas estão a atacar as desigualdades no nosso mundo. Nem mesmo dos problemas das nações-estados na moderna economia mundial. Como já ouviram dizer, vou falar de estatísticas e, para ser mais preciso, vou falar de coisas apaixonantes na estatística. (Risos) Para mim, é muito mais desafiador, do que todos os oradores antes de mim e dos que vierem depois. (Risos) Um dos meus colegas mais velhos disse-me, quando eu era novato nesta profissão, orgulhosamente, que os estatísticos eram pessoas que gostavam de números, mas não tinham a personalidade necessária para serem contabilistas. (Risos) Há uma outra piada, entre os estatísticos, que é: "Qual a diferença entre um estatístico introvertido e um extrovertido?" E a resposta é: "O estatístico extrovertido é aquele que olha para os sapatos dos outros". (Risos) Mas eu quero falar-vos de uma coisa útil — cá vai, por isso concentrem-se. Esta noite, há uma receção no Museu de História Natural da Universidade. É um ambiente magnífico, espero que apreciem, e é um ícone muito representativo do melhor da tradição vitoriana. É pouco provável — neste ambiente especial e com este grupo de pessoas — mas talvez se encontrem a falar com alguém com quem preferiam não falar. Nesse caso, façam o seguinte: Quando vos perguntarem: "O que é que faz?", digam: "Sou estatístico". (Risos) O pior é que agora já estão avisados e sabem que vocês estão a fingir. Então, acontece uma destas duas coisas. Ou descobrem um primo distante no outro canto da sala e vão a correr falar com ele, ou ficam com sede ou fome, ou as duas coisas, e vão à procura duma bebida e de qualquer coisa para comer. E vocês ficam à vontade para falarem com a pessoa que pretendem.

It's one of the challenges in our profession to try and explain what we do. We're not top on people's lists for dinner party guests and conversations and so on. And it's something I've never really found a good way of doing. But my wife -- who was then my girlfriend -- managed it much better than I've ever been able to. Many years ago, when we first started going out, she was working for the BBC in Britain, and I was, at that stage, working in America. I was coming back to visit her. She told this to one of her colleagues, who said, "Well, what does your boyfriend do?" Sarah thought quite hard about the things I'd explained -- and she concentrated, in those days, on listening. (Laughter) Don't tell her I said that. And she was thinking about the work I did developing mathematical models for understanding evolution and modern genetics. So when her colleague said, "What does he do?" She paused and said, "He models things." (Laughter) Well, her colleague suddenly got much more interested than I had any right to expect and went on and said, "What does he model?" Well, Sarah thought a little bit more about my work and said, "Genes." (Laughter) "He models genes."

É um dos problemas na nossa profissão tentar explicar o que fazemos. Não somos os melhores convivas para jantares nem para conversas. E é uma coisa para que nunca tive muito jeito. Mas a minha mulher — que, na altura, era minha namorada — conseguia-o muito melhor do que eu. Há muitos anos, quando começámos a sair, ela trabalhava para a BBC, na Grã-Bretanha e eu, naquela altura, trabalhava nos EUA. Eu vinha visitá-la. Ela disse isso a uma colega, que perguntou: "O que é que o teu namorado faz?" Sarah pôs-se a refletir nas coisas que eu lhe contava — naquele tempo, fazia um esforço para me escutar. (Risos) Não lhe digam que eu contei isto. Pensava no trabalho que eu fazia a desenvolver modelos matemáticos para compreender a evolução e a genética moderna. Por isso, quando a colega perguntou: "O que é que ele faz?" ela respondeu: "Modela coisas". (Risos) A colega ficou muito mais interessada do que eu tinha o direito de esperar e continuou a perguntar: "O que é que ele modela?" Sarah pensou um pouco mais no meu trabalho e disse: "Genes". (Risos) "Modela genes".

That is my first love, and that's what I'll tell you a little bit about. What I want to do more generally is to get you thinking about the place of uncertainty and randomness and chance in our world, and how we react to that, and how well we do or don't think about it. So you've had a pretty easy time up till now -- a few laughs, and all that kind of thing -- in the talks to date. You've got to think, and I'm going to ask you some questions. So here's the scene for the first question I'm going to ask you. Can you imagine tossing a coin successively? And for some reason -- which shall remain rather vague -- we're interested in a particular pattern. Here's one -- a head, followed by a tail, followed by a tail.

É esse o meu primeiro amor, e é sobre isso que vos vou falar um pouco. O que eu queria, de modo geral, é pôr-vos a pensar no papel da incerteza, do aleatório e do acaso, no nosso mundo, e como reagimos a isso, como pensamos nisso, bem ou mal. Vocês já passaram um bom bocado, até agora — umas gargalhadas, e isso tudo — nas palestras até aqui. Vão ter que pensar e eu vou fazer umas perguntas. Este é o cenário para a primeira pergunta que vou fazer. Imaginem uma moeda a ser lançada ao ar, sucessivamente. Por alguma razão — que se vai manter bastante vaga — estamos interessados num padrão especial. É este — uma cara, seguida por uma coroa, seguida por uma coroa.

So suppose we toss a coin repeatedly. Then the pattern, head-tail-tail, that we've suddenly become fixated with happens here. And you can count: one, two, three, four, five, six, seven, eight, nine, 10 -- it happens after the 10th toss. So you might think there are more interesting things to do, but humor me for the moment. Imagine this half of the audience each get out coins, and they toss them until they first see the pattern head-tail-tail. The first time they do it, maybe it happens after the 10th toss, as here. The second time, maybe it's after the fourth toss. The next time, after the 15th toss. So you do that lots and lots of times, and you average those numbers. That's what I want this side to think about.

Suponham que atiram uma moeda ao ar, repetidas vezes. Então, o padrão cara-coroa-coroa, que escolhemos, ocorre aqui. Podemos contar: um, dois, três, quatro, cinco, seis, sete, oito, nove, dez — acontece após o 10.º lançamento. Podem pensar que há coisas mais interessantes para fazer, mas tenham paciência. Imaginem que esta metade da audiência tem moedas e as lança ao ar, até ver pela primeira vez o padrão cara-coroa-coroa. Pode acontecer que a primeira vez ocorra depois do 10.º lançamento. A segunda vez talvez após o quarto. A vez seguinte, após o 15.º. Fazemos muitos lançamentos e tiramos a média a esses números.

The other half of the audience doesn't like head-tail-tail -- they think, for deep cultural reasons, that's boring -- and they're much more interested in a different pattern -- head-tail-head. So, on this side, you get out your coins, and you toss and toss and toss. And you count the number of times until the pattern head-tail-head appears and you average them. OK? So on this side, you've got a number -- you've done it lots of times, so you get it accurately -- which is the average number of tosses until head-tail-tail. On this side, you've got a number -- the average number of tosses until head-tail-head.

É nisso que eu quero que este lado pense. A outra metade da audiência não gosta de cara-coroa-coroa. Por profundas razões culturais, pensam que é aborrecida. Estão mais interessados noutro padrão — cara-coroa-cara. Deste lado, recebem as moedas e começam os lançamentos. Contam o número de vezes, até aparecer o padrão cara-coroa-cara e fazem a média. Ok? Então, deste lado, têm um número — fizeram isso montes de vezes, têm um número exato — que é o número médio de lançamentos até cara-coroa-coroa.

So here's a deep mathematical fact -- if you've got two numbers, one of three things must be true. Either they're the same, or this one's bigger than this one, or this one's bigger than that one. So what's going on here? So you've all got to think about this, and you've all got to vote -- and we're not moving on. And I don't want to end up in the two-minute silence to give you more time to think about it, until everyone's expressed a view. OK. So what you want to do is compare the average number of tosses until we first see head-tail-head with the average number of tosses until we first see head-tail-tail.

Deste lado, têm o número médio de lançamentos até cara-coroa-cara. Há uma regra matemática fundamental — se já têm os dois números, uma de três coisas é verdade. Ou são iguais, ou este é maior do que este, ou este é maior do que este. O que se passa aqui? Agora vão ter que pensar nisto e vão ter que votar e daqui não saímos. Não quero dar-vos tempo demais para refletir nisso, enquanto não tiverem votado todos. Comparem o número médio de lançamentos até termos visto cara-coroa-coroa, com o número médio de lançamentos até termos visto cara-coroa-cara.

Who thinks that A is true -- that, on average, it'll take longer to see head-tail-head than head-tail-tail? Who thinks that B is true -- that on average, they're the same? Who thinks that C is true -- that, on average, it'll take less time to see head-tail-head than head-tail-tail? OK, who hasn't voted yet? Because that's really naughty -- I said you had to. (Laughter) OK. So most people think B is true. And you might be relieved to know even rather distinguished mathematicians think that. It's not. A is true here. It takes longer, on average. In fact, the average number of tosses till head-tail-head is 10 and the average number of tosses until head-tail-tail is eight. How could that be? Anything different about the two patterns? There is. Head-tail-head overlaps itself. If you went head-tail-head-tail-head, you can cunningly get two occurrences of the pattern in only five tosses. You can't do that with head-tail-tail. That turns out to be important.

Quem acha que A é verdadeiro, que, em média, é mais demorado ver cara-coroa-cara do que cara-coroa-coroa? Quem pensa que B é verdadeiro, que, em média, são iguais? Quem pensa que C é verdadeiro, que, em média, é menos demorado ver cara-coroa-cara do que cara-coroa-coroa? Quem é que não votou? Isso é muito chato, eu disse que tinham que votar. (Risos) A maior parte das pessoas acha que B é que é verdade. Podem ficar tranquilos porque eminentes matemáticos pensam o mesmo. Mas não é. É A que é verdade. Demora mais tempo, em média. O número médio de lançamentos até cara-coroa-cara é 10 e o número médio de lançamentos até cara-coroa-coroa é 8. Como é que isso pode ser? Há alguma diferença entre os dois padrões? Há sim. Cara-coroa-cara sobrepõe-se a si próprio. Se fizessem cara-coroa-cara-coroa-cara, podiam obter facilmente duas ocorrências do padrão apenas em cinco lançamentos. Não podem fazer isso com cara-coroa-coroa. Acontece que isto é importante.
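The 10-versus-8 result can be checked with a short sketch. The function names below are illustrative and do not come from the talk. `expected_tosses` uses the standard self-overlap rule for a fair coin: add 2^k for every length k at which the pattern's first k symbols equal its last k symbols. `simulate_mean_tosses` is a plain Monte Carlo cross-check, assuming a fair coin and independent tosses.

```python
import random


def expected_tosses(pattern: str, sides: int = 2) -> int:
    """Exact mean number of tosses until `pattern` first appears.

    Sums sides**k over every k where the length-k prefix of the
    pattern equals its length-k suffix (the self-overlaps).
    """
    n = len(pattern)
    return sum(sides ** k for k in range(1, n + 1)
               if pattern[:k] == pattern[-k:])


def simulate_mean_tosses(pattern: str, trials: int, seed: int = 0) -> float:
    """Estimate the same mean by repeatedly tossing a simulated fair coin."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        window = ""   # the most recent len(pattern) tosses
        tosses = 0
        while window != pattern:
            window = (window + rng.choice("HT"))[-len(pattern):]
            tosses += 1
        total += tosses
    return total / trials


print(expected_tosses("HTH"))  # 10: overlaps at k=1 ("H") and k=3
print(expected_tosses("HTT"))  # 8: only the full-length overlap, k=3
print(simulate_mean_tosses("HTH", 20_000))
print(simulate_mean_tosses("HTT", 20_000))
```

The extra 2 for head-tail-head comes entirely from the single-symbol overlap, the final head that can double as the first head of the next attempt.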

There are two ways of thinking about this. I'll give you one of them. So imagine -- let's suppose we're doing it. On this side -- remember, you're excited about head-tail-tail; you're excited about head-tail-head. We start tossing a coin, and we get a head -- and you start sitting on the edge of your seat because something great and wonderful, or awesome, might be about to happen. The next toss is a tail -- you get really excited. The champagne's on ice just next to you; you've got the glasses chilled to celebrate. You're waiting with bated breath for the final toss. And if it comes down a head, that's great. You're done, and you celebrate. If it's a tail -- well, rather disappointedly, you put the glasses away and put the champagne back. And you keep tossing, to wait for the next head, to get excited.

Há duas formas de pensar nisto. Vou dar-vos uma delas. Suponhamos que começamos os lançamentos. Lembrem-se, deste lado estão obcecados com cara-coroa-coroa, e vocês com cara-coroa-cara. Começamos os lançamentos e sai cara. Vocês retorcem-se na cadeira, porque pode estar para acontecer uma coisa maravilhosa, espantosa. O lançamento seguinte é coroa — vocês ficam entusiasmadíssimos. O champanhe no gelo está perto, os copos estão gelados, para a festa. Estão à espera, de respiração suspensa, do lançamento final. Se for cara, é ótimo. Ficam contentes e é uma festa. Se for coroa, desiludidos, põem os copos de lado e mandam embora o champanhe. Continuam os lançamentos à espera que saia cara.

On this side, there's a different experience. It's the same for the first two parts of the sequence. You're a little bit excited with the first head -- you get rather more excited with the next tail. Then you toss the coin. If it's a tail, you crack open the champagne. If it's a head you're disappointed, but you're still a third of the way to your pattern again. And that's an informal way of presenting it -- that's why there's a difference. Another way of thinking about it -- if we tossed a coin eight million times, then we'd expect a million head-tail-heads and a million head-tail-tails -- but the head-tail-heads could occur in clumps. So if you want to put a million things down amongst eight million positions and you can have some of them overlapping, the clumps will be further apart. It's another way of getting the intuition.

Deste lado, há uma experiência diferente. É o mesmo nas duas primeiras partes da sequência. Ficam um pouco entusiasmados com a primeira cara — ficam mais entusiasmados com a coroa a seguir. Depois lançam a moeda. Se for coroa, podem abrir o champanhe. Se for cara, ficam desiludidos mas voltam a estar a um terço do caminho do vosso padrão. Esta é uma maneira informal de apresentar — é por isso que há uma diferença. Outra forma de pensar nisso. Se lançarmos uma moeda oito milhões de vezes, podemos esperar um milhão de cara-coroa-cara e um milhão de cara-coroa-coroa mas cara-coroa-cara pode ocorrer em cachos. Portanto, se quiserem pôr um milhão de coisas entre oito milhões de posições e podemos ter algumas que se sobrepõem, os cachos estarão mais separados. É outra forma de tornarmos as coisas intuitivas.
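The clumping argument can also be tried numerically. The sketch below uses 800,000 simulated fair tosses rather than eight million, only to keep the run short, so each pattern should appear roughly 100,000 times. The helper names are illustrative. Counting the five-toss run head-tail-head-tail-head shows how often two head-tail-heads share a toss, which head-tail-tail can never do.

```python
import random


def count_overlapping(seq: str, pattern: str) -> int:
    """Count occurrences of `pattern` in `seq`, allowing overlaps."""
    count = 0
    start = seq.find(pattern)
    while start != -1:
        count += 1
        start = seq.find(pattern, start + 1)
    return count


def toss_sequence(n: int, seed: int = 0) -> str:
    """Return n simulated fair coin tosses as a string of 'H' and 'T'."""
    rng = random.Random(seed)
    return "".join(rng.choices("HT", k=n))


tosses = toss_sequence(800_000)
hth = count_overlapping(tosses, "HTH")
htt = count_overlapping(tosses, "HTT")
hthth = count_overlapping(tosses, "HTHTH")  # two HTHs sharing a middle H

# Both patterns occur about equally often, but a sizeable share of the
# HTH occurrences sit in overlapping clumps.
print(hth, htt, hthth)
```

Equal counts with clumped positions mean the gaps between clumps must be longer, which is the intuition given in the talk.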

What's the point I want to make? It's a very, very simple example, an easily stated question in probability, which every -- you're in good company -- everybody gets wrong. This is my little diversion into my real passion, which is genetics. There's a connection between head-tail-heads and head-tail-tails in genetics, and it's the following. When you toss a coin, you get a sequence of heads and tails. When you look at DNA, there's a sequence of not two things -- heads and tails -- but four letters -- As, Gs, Cs and Ts. And there are little chemical scissors, called restriction enzymes which cut DNA whenever they see particular patterns. And they're an enormously useful tool in modern molecular biology. And instead of asking the question, "How long until I see a head-tail-head?" -- you can ask, "How big will the chunks be when I use a restriction enzyme which cuts whenever it sees G-A-A-G, for example? How long will those chunks be?"

Onde é que eu quero chegar? É um exemplo muito simples, uma questão de probabilidade fácil em que toda a gente se engana — e vocês estão bem acompanhados. Este é um pequeno desvio para chegar à minha paixão, que é a genética. Há uma ligação entre cara-coroa-cara e cara-coroa-coroa, na genética, que é a seguinte. Quando lançamos uma moeda, obtemos uma sequência de caras e coroas. No ADN, há uma sequência que não é de duas coisas — caras e coroas — mas de quatro letras — A, G, C e T. E há pequenas tesouras químicas, chamadas enzimas de restrição, que cortam o ADN sempre que veem determinados padrões. É uma ferramenta extremamente útil na moderna biologia molecular. Em vez de perguntarem: "Ao fim de quanto tempo ocorre uma cara-coroa-cara?", podem perguntar: "De que tamanho são os pedaços se usar uma enzima de restrição que corta quando vê G-A-A-G, por exemplo? De que tamanho serão esses pedaços?"

That's a rather trivial connection between probability and genetics. There's a much deeper connection, which I don't have time to go into and that is that modern genetics is a really exciting area of science. And we'll hear some talks later in the conference specifically about that. But it turns out that unlocking the secrets in the information generated by modern experimental technologies, a key part of that has to do with fairly sophisticated -- you'll be relieved to know that I do something useful in my day job, rather more sophisticated than the head-tail-head story -- but quite sophisticated computer modelings and mathematical modelings and modern statistical techniques. And I will give you two little snippets -- two examples -- of projects we're involved in in my group in Oxford, both of which I think are rather exciting. You know about the Human Genome Project. That was a project which aimed to read one copy of the human genome. The natural thing to do after you've done that -- and that's what this project, the International HapMap Project, which is a collaboration between labs in five or six different countries. Think of the Human Genome Project as learning what we've got in common, and the HapMap Project is trying to understand where there are differences between different people.

É uma ligação bastante trivial entre probabilidade e genética. Há uma ligação muito mais profunda, em que não tenho tempo para entrar, e que faz da genética moderna uma área da ciência apaixonante. Vamos ouvir palestras nesta conferência especificamente sobre isso. Mas acontece que, para desbloquear os segredos nos dados gerados pelas modernas tecnologias experimentais, uma parte fundamental tem a ver — tranquilizem-se, eu faço algo de útil no meu trabalho — com coisas bastante mais sofisticadas do que a história de cara-coroa-cara, com sofisticados modelos informáticos e matemáticos e técnicas estatísticas modernas. Vou dar-vos dois pequenos fragmentos — dois exemplos — de projetos em que estamos envolvidos no meu grupo em Oxford, que eu considero apaixonantes. Vocês conhecem o Projeto do Genoma Humano. Foi um projeto que pretendia ler uma cópia do genoma humano. A coisa natural a fazer, depois de fazer aquilo, é o Projeto Internacional HapMap, que é uma colaboração entre laboratórios de cinco ou seis países diferentes. Pensem no Projeto do Genoma Humano como aprender o que temos em comum e no Projeto HapMap como tentar compreender onde há diferenças entre pessoas diferentes.

Why do we care about that? Well, there are lots of reasons. The most pressing one is that we want to understand how some differences make some people susceptible to one disease -- type-2 diabetes, for example -- and other differences make people more susceptible to heart disease, or stroke, or autism and so on. That's one big project. There's a second big project, recently funded by the Wellcome Trust in this country, involving very large studies -- thousands of individuals, with each of eight different diseases, common diseases like type-1 and type-2 diabetes, and coronary heart disease, bipolar disease and so on -- to try and understand the genetics. To try and understand what it is about genetic differences that causes the diseases. Why do we want to do that? Because we understand very little about most human diseases. We don't know what causes them. And if we can get in at the bottom and understand the genetics, we'll have a window on the way the disease works, and a whole new way about thinking about disease therapies and preventative treatment and so on. So that's, as I said, the little diversion on my main love.

Porque é que nos preocupamos com isso? Há muitas razões. A mais importante é que queremos perceber como algumas diferenças tornam algumas pessoas suscetíveis a uma doença — digamos, diabetes tipo 2 — e outras diferenças tornam as pessoas mais suscetíveis a doenças cardíacas, ou AVC, ou autismo, etc. É um projeto importante. Há outro projeto importante, recém-financiado pelo Wellcome Trust neste país, que envolve estudos muito amplos — milhares de indivíduos, com cada uma de oito doenças diferentes, doenças vulgares como diabetes tipo 1 e tipo 2, doenças coronárias, doença bipolar, etc. — para tentar perceber a genética. Tentar perceber quais são as diferenças genéticas que causam as doenças. Porque é que queremos fazer isso? Porque sabemos muito pouco sobre a maior parte das doenças humanas. Não sabemos o que é que as provoca. Aprofundando e compreendendo a genética, abrimos uma janela para a forma como funciona a doença e uma nova forma de pensar nas terapias para as doenças e no tratamento preventivo. Este foi o desvio de que falei, à minha principal paixão.

Back to some of the more mundane issues of thinking about uncertainty. Here's another quiz for you -- now suppose we've got a test for a disease which isn't infallible, but it's pretty good. It gets it right 99 percent of the time. And I take one of you, or I take someone off the street, and I test them for the disease in question. Let's suppose there's a test for HIV -- the virus that causes AIDS -- and the test says the person has the disease. What's the chance that they do? The test gets it right 99 percent of the time. So a natural answer is 99 percent. Who likes that answer? Come on -- everyone's got to get involved. Don't think you don't trust me anymore. (Laughter) Well, you're right to be a bit skeptical, because that's not the answer. That's what you might think. It's not the answer, and it's not because it's only part of the story. It actually depends on how common or how rare the disease is. So let me try and illustrate that. Here's a little caricature of a million individuals. So let's think about a disease that affects -- it's pretty rare, it affects one person in 10,000. Amongst these million individuals, most of them are healthy and some of them will have the disease. And in fact, if this is the prevalence of the disease, about 100 will have the disease and the rest won't. So now suppose we test them all. What happens? Well, amongst the 100 who do have the disease, the test will get it right 99 percent of the time, and 99 will test positive. Amongst all these other people who don't have the disease, the test will get it right 99 percent of the time. It'll only get it wrong one percent of the time. But there are so many of them that there'll be an enormous number of false positives. Put that another way -- of all of them who test positive -- so here they are, the individuals involved -- less than one in 100 actually have the disease. 
So even though we think the test is accurate, the important part of the story is there's another bit of information we need.

Voltemos às questões mais prosaicas de pensar na incerteza. Eis outro questionário. Suponhamos que temos um teste para uma doença que não é infalível, mas é muito bom. Acerta em cheio 99% das vezes. Escolho um de vocês, ou alguém no meio da rua e faço-lhe o teste para a doença em questão. Suponhamos que é um teste para o VIH — o vírus que provoca a SIDA — e o teste diz que a pessoa tem essa doença. Qual é a probabilidade que ela a tenha? O teste acerta em cheio 99% das vezes. Portanto, a resposta natural é 99%. Quem é que gosta desta resposta? Vá, todos têm que participar. Não pensem que já não confiam em mim. (Risos) Têm razão em estar um pouco céticos, porque a resposta não é essa. Essa é a que se pode pensar. Mas não é a resposta, porque é apenas parte da história. Depende de a doença ser vulgar ou rara. Vou tentar ilustrar isso. Isto é uma representação de um milhão de indivíduos. Pensemos numa doença que seja rara, que afete uma pessoa em 10 000. Neste milhão de indivíduos, a maioria é saudável e alguns deles terão a doença. De facto, se for esta a prevalência da doença, cerca de 100 terão a doença e os restantes não. Agora suponhamos que os testamos a todos. O que acontece? Entre os 100 que têm a doença, o teste acertará 99% das vezes e 99 terão um resultado positivo. Entre as outras pessoas que não têm a doença, o teste acertará 99% das vezes. Só estará errado 1% das vezes. Mas há tanta gente que o número de falsos positivos será enorme. Por outras palavras, de todos os que obtiverem positivo — estes são os indivíduos envolvidos — menos de um em 100 tem mesmo a doença. Portanto, apesar de pensarmos que o teste é rigoroso, a parte importante da história é que precisamos de mais informações.
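The arithmetic in this example can be reproduced in a few lines. This is a sketch under one stated assumption: "gets it right 99 percent of the time" is read as meaning that the test is correct for 99 percent of people who have the disease and also for 99 percent of people who do not. The function and parameter names are illustrative, not from the talk.

```python
from fractions import Fraction


def chance_of_disease_given_positive(prevalence: Fraction,
                                     accuracy: Fraction) -> Fraction:
    """Share of positive results that come from people who are really ill.

    Assumes `accuracy` applies equally to ill and healthy people.
    """
    true_positives = prevalence * accuracy
    false_positives = (1 - prevalence) * (1 - accuracy)
    return true_positives / (true_positives + false_positives)


population = 1_000_000
prevalence = Fraction(1, 10_000)
accuracy = Fraction(99, 100)

ill = population * prevalence                       # 100 people
true_pos = ill * accuracy                           # 99 people
false_pos = (population - ill) * (1 - accuracy)     # 9,999 people
print(ill, true_pos, false_pos)

# 99 out of 10,098 positives, which is exactly 1 in 102
print(chance_of_disease_given_positive(prevalence, accuracy))
```

With the same test and a prevalence of one in 100, the same function gives one in two, which shows how strongly the answer depends on how common the disease is.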

Here's the key intuition. What we have to do, once we know the test is positive, is to weigh up the plausibility, or the likelihood, of two competing explanations. Each of those explanations has a likely bit and an unlikely bit. One explanation is that the person doesn't have the disease -- that's overwhelmingly likely, if you pick someone at random -- but the test gets it wrong, which is unlikely. The other explanation is that the person does have the disease -- that's unlikely -- but the test gets it right, which is likely. And the number we end up with -- that number which is a little bit less than one in 100 -- is to do with how likely one of those explanations is relative to the other. Each of them taken together is unlikely.

Esta é a intuição fundamental. Quando sabemos que o teste é positivo, temos que avaliar a plausibilidade de duas explicações opostas. Cada uma dessas explicações é, em parte, provável e, em parte, improvável. Uma das explicações é que a pessoa não tem a doença — que é uma probabilidade forte, se agarrarmos numa pessoa ao acaso — mas o teste está errado, o que é pouco provável. A outra explicação é que a pessoa tem a doença — o que é improvável — mas o teste está correto, o que é provável. E acabamos com um número — um pouco menor do que um em 100 — que depende de quão provável é uma das explicações em relação à outra. Cada uma delas, considerada no seu conjunto, é pouco provável.

Here's a more topical example of exactly the same thing. Those of you in Britain will know about what's become rather a celebrated case of a woman called Sally Clark, who had two babies who died suddenly. And initially, it was thought that they died of what's known informally as "cot death," and more formally as "Sudden Infant Death Syndrome." For various reasons, she was later charged with murder. And at the trial, her trial, a very distinguished pediatrician gave evidence that the chance of two cot deaths, innocent deaths, in a family like hers -- which was professional and non-smoking -- was one in 73 million. To cut a long story short, she was convicted at the time. Later, and fairly recently, acquitted on appeal -- in fact, on the second appeal. And just to set it in context, you can imagine how awful it is for someone to have lost one child, and then two, if they're innocent, to be convicted of murdering them. To be put through the stress of the trial, convicted of murdering them -- and to spend time in a women's prison, where all the other prisoners think you killed your children -- is a really awful thing to happen to someone. And it happened in large part here because the expert got the statistics horribly wrong, in two different ways.

Este é um exemplo mais atual, exatamente da mesma coisa. Os britânicos aqui conhecem um caso que ficou célebre duma mulher chamada Sally Clark, cujos dois bebés morreram subitamente. Inicialmente, pensou-se que eles tinham morrido de "morte súbita", ou seja, "Síndrome da morte súbita infantil". Por diversas razões, foi depois acusada de homicídio. No julgamento dela, um pediatra de renome testemunhou que a possibilidade de duas mortes súbitas de bebés, numa família como a dela — que trabalhava e não era fumadora — era de uma em 73 milhões. Para abreviar, ela foi condenada. Mais tarde, há pouco tempo, foi absolvida num segundo apelo. Só para contextualizar, podem imaginar como deve ser terrível perder um filho, ainda por cima dois, se estiverem inocentes, e ser condenada por tê-los matado. Sofrer a tensão do julgamento, ser condenada por homicídio, estar numa prisão, onde as prisioneiras pensam que matara os filhos — é terrível acontecer isso a alguém. E aconteceu, em grande parte, porque o especialista interpretou mal as estatísticas, de duas formas diferentes.

So where did he get the one in 73 million number? He looked at some research, which said the chance of one cot death in a family like Sally Clark's is about one in 8,500. So he said, "I'll assume that if you have one cot death in a family, the chance of a second child dying from cot death aren't changed." So that's what statisticians would call an assumption of independence. It's like saying, "If you toss a coin and get a head the first time, that won't affect the chance of getting a head the second time." So if you toss a coin twice, the chance of getting a head twice are a half -- that's the chance the first time -- times a half -- the chance a second time. So he said, "Here, I'll assume that these events are independent. When you multiply 8,500 together twice, you get about 73 million." And none of this was stated to the court as an assumption or presented to the jury that way. Unfortunately here -- and, really, regrettably -- first of all, in a situation like this you'd have to verify it empirically. And secondly, it's palpably false. There are lots and lots of things that we don't know about sudden infant deaths. It might well be that there are environmental factors that we're not aware of, and it's pretty likely to be the case that there are genetic factors we're not aware of. So if a family suffers from one cot death, you'd put them in a high-risk group. They've probably got these environmental risk factors and/or genetic risk factors we don't know about. And to argue, then, that the chance of a second death is as if you didn't know that information is really silly. It's worse than silly -- it's really bad science. Nonetheless, that's how it was presented, and at trial nobody even argued it. That's the first problem. The second problem is, what does the number of one in 73 million mean? 
So after Sally Clark was convicted -- you can imagine, it made rather a splash in the press -- one of the journalists from one of Britain's more reputable newspapers wrote that what the expert had said was, "The chance that she was innocent was one in 73 million." Now, that's a logical error. It's exactly the same logical error as thinking that after the disease test, which is 99 percent accurate, the chance of having the disease is 99 percent. In the disease example, we had to bear in mind two things, one of which was the possibility that the test got it right or not. And the other one was the chance, a priori, that the person had the disease or not. It's exactly the same in this context. There are two things involved -- two parts to the explanation. We want to know how likely, or relatively how likely, two different explanations are. One of them is that Sally Clark was innocent -- which is, a priori, overwhelmingly likely -- most mothers don't kill their children. And the second part of the explanation is that she suffered an incredibly unlikely event. Not as unlikely as one in 73 million, but nonetheless rather unlikely. The other explanation is that she was guilty. Now, we probably think a priori that's unlikely. And we certainly should think in the context of a criminal trial that that's unlikely, because of the presumption of innocence. And then if she were trying to kill the children, she succeeded. So the chance that she's innocent isn't one in 73 million. We don't know what it is. It has to do with weighing up the strength of the other evidence against her and the statistical evidence. We know the children died. What matters is how likely or unlikely, relative to each other, the two explanations are. And they're both implausible. There's a situation where errors in statistics had really profound and really unfortunate consequences.
In fact, there are two other women who were convicted on the basis of the evidence of this pediatrician, who have subsequently been released on appeal. Many cases were reviewed. And it's particularly topical because he's currently facing a disrepute charge at Britain's General Medical Council.
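The comparison of the two explanations in the Sally Clark discussion can be sketched as a small calculation. This is a minimal sketch under strong assumptions: it considers only the two explanations named in the talk, ignores all other evidence, and every probability passed in below is a hypothetical placeholder, since the talk itself says the real figure is unknown. The function name `prob_innocent` is made up for this example.

```python
from fractions import Fraction


def prob_innocent(p_deaths_and_innocent, p_deaths_and_guilty):
    """Chance of innocence given that both children died.

    Each argument is the a priori chance of one complete explanation:
    - innocent: two natural deaths occur in the family
    - guilty: the mother kills both children
    Only the size of one relative to the other matters.
    """
    total = p_deaths_and_innocent + p_deaths_and_guilty
    return p_deaths_and_innocent / total


# HYPOTHETICAL inputs. Even taking one in 73 million at face value for the
# innocent explanation, the answer depends on how rare the guilty one is.
innocent = Fraction(1, 73_000_000)

# If double murder were equally rare, innocence would be a coin flip.
print(prob_innocent(innocent, Fraction(1, 73_000_000)))   # 1/2

# If double murder were ten times rarer, innocence would be the likely case.
print(prob_innocent(innocent, Fraction(1, 730_000_000)))  # 10/11
```

In neither hypothetical case is the chance of innocence anywhere near one in 73 million, which is the point of the argument: a small probability for one explanation says nothing until it is weighed against the probability of the alternative.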


So just to conclude -- what are the take-home messages from this? Well, we know that randomness and uncertainty and chance are very much a part of our everyday life. It's also true -- and, although, you, as a collective, are very special in many ways, you're completely typical in not getting the examples I gave right. It's very well documented that people get things wrong. They make errors of logic in reasoning with uncertainty. We can cope with the subtleties of language brilliantly -- and there are interesting evolutionary questions about how we got here. We are not good at reasoning with uncertainty. That's an issue in our everyday lives. As you've heard from many of the talks, statistics underpins an enormous amount of research in science -- in social science, in medicine and indeed, quite a lot of industry. All of quality control, which has had a major impact on industrial processing, is underpinned by statistics. It's something we're bad at doing. At the very least, we should recognize that, and we tend not to. To go back to the legal context, at the Sally Clark trial all of the lawyers just accepted what the expert said. So if a pediatrician had come out and said to a jury, "I know how to build bridges. I've built one down the road. Please drive your car home over it," they would have said, "Well, pediatricians don't know how to build bridges. That's what engineers do." On the other hand, he came out and effectively said, or implied, "I know how to reason with uncertainty. I know how to do statistics." And everyone said, "Well, that's fine. He's an expert." So we need to understand where our competence is and isn't. Exactly the same kinds of issues arose in the early days of DNA profiling, when scientists, and lawyers and in some cases judges, routinely misrepresented evidence. Usually -- one hopes -- innocently, but misrepresented evidence. Forensic scientists said, "The chance that this guy's innocent is one in three million." 
Even if you believe the number, just like the 73 million to one, that's not what it meant. And there have been celebrated appeal cases in Britain and elsewhere because of that.


And just to finish in the context of the legal system. It's all very well to say, "Let's do our best to present the evidence." But more and more, in cases of DNA profiling -- this is another one -- we expect juries, who are ordinary people -- and it's documented they're very bad at this -- we expect juries to be able to cope with the sort of reasoning that goes on. In other spheres of life, if people argued -- well, except possibly for politics -- but in other spheres of life, if people argued illogically, we'd say that's not a good thing. We sort of expect it of politicians and don't hope for much more. In the case of uncertainty, we get it wrong all the time -- and at the very least, we should be aware of that, and ideally, we might try and do something about it. Thanks very much.

