Peter Donnelly: How juries are fooled by statistics

As other speakers have said, it's a rather daunting experience -- a particularly daunting experience -- to be speaking in front of this audience. But unlike the other speakers, I'm not going to tell you about the mysteries of the universe, or the wonders of evolution, or the really clever, innovative ways people are attacking the major inequalities in our world. Or even the challenges of nation-states in the modern global economy. My brief, as you've just heard, is to tell you about statistics -- and, to be more precise, to tell you some exciting things about statistics. And that's -- (Laughter) -- that's rather more challenging than all the speakers before me and all the ones coming after me. (Laughter) One of my senior colleagues told me, when I was a youngster in this profession, rather proudly, that statisticians were people who liked figures but didn't have the personality skills to become accountants. (Laughter) And there's another in-joke among statisticians, and that's, "How do you tell the introverted statistician from the extroverted statistician?" To which the answer is, "The extroverted statistician's the one who looks at the other person's shoes." (Laughter) But I want to tell you something useful -- and here it is, so concentrate now. This evening, there's a reception in the University's Museum of Natural History. And it's a wonderful setting, as I hope you'll find, and a great icon to the best of the Victorian tradition. It's very unlikely -- in this special setting, and this collection of people -- but you might just find yourself talking to someone you'd rather wish that you weren't. So here's what you do. When they say to you, "What do you do?" -- you say, "I'm a statistician." (Laughter) Well, except they've been pre-warned now, and they'll know you're making it up. And then one of two things will happen. They'll either discover their long-lost cousin in the other corner of the room and run over and talk to them. 
Or they'll suddenly become parched and/or hungry -- and often both -- and sprint off for a drink and some food. And you'll be left in peace to talk to the person you really want to talk to.

Como otros conferenciantes han dicho, es una experiencia bastante intimidante - una experiencia particularmente intimidante - hablar enfrente de esta audiencia. Pero a diferencia de los demás, yo no les voy a hablar acerca de los misterios del universo o de las maravillas de la evolución, o de las maneras ingeniosas, innovadoras, en que se están atacando las mayores desigualdades de nuestro mundo. O incluso de los retos de las naciones en la economía global moderna. Mi trabajo, como han oído, es hablarles de estadística -- y, para ser más preciso, contarles algunas cosas apasionantes acerca de ella. Y eso es -- (Risas) -- eso es bastante más difícil que lo que han hecho todos los conferenciantes antes de mí y todos los que vengan después. (Risas) Uno de mis colegas me dijo, cuando yo era un novato en esta profesión, orgullosamente, que los estadísticos eran personas a quienes les gustaban los números pero no tenían la personalidad para ser contables. (Risas) Y hay otro chiste entre estadísticos que es: “¿Cómo distingues al estadístico introvertido del estadístico extrovertido? Cuya respuesta es, “El estadístico extrovertido es el que mira los zapatos de la otra persona.” (Risas) Pero quiero decirles algo útil – y aquí está, así que concéntrense. Esta tarde, hay una recepción en el Museo de Historia Natural de la Universidad. Y es un lugar maravilloso, como espero que noten, y un gran icono de lo mejor de la tradición victoriana. Es muy improbable – en este lugar especial, con este grupo de gente -- pero puede ser que se encuentren hablando con alguien indeseable. Así que aquí está lo que deben hacer. Cuando les pregunten ”¿A qué se dedica?” Responden: “Soy estadístico”. (Risas) Bueno, excepto porque ahora están advertidos, y sabrán que se lo está inventando. Entonces una de dos cosas pasará. 
O descubrirán a un primo perdido en la otra esquina de la habitación y correrán a hablar con él, o de repente se sentirán sedientos y/o hambrientos – a menudo ambas-- y correrán a por un trago y algo de comida. Y a usted lo dejarán en paz para hablar con la persona con quien realmente quería.

It's one of the challenges in our profession to try and explain what we do. We're not top on people's lists for dinner party guests and conversations and so on. And it's something I've never really found a good way of doing. But my wife -- who was then my girlfriend -- managed it much better than I've ever been able to. Many years ago, when we first started going out, she was working for the BBC in Britain, and I was, at that stage, working in America. I was coming back to visit her. She told this to one of her colleagues, who said, "Well, what does your boyfriend do?" Sarah thought quite hard about the things I'd explained -- and she concentrated, in those days, on listening. (Laughter) Don't tell her I said that. And she was thinking about the work I did developing mathematical models for understanding evolution and modern genetics. So when her colleague said, "What does he do?" She paused and said, "He models things." (Laughter) Well, her colleague suddenly got much more interested than I had any right to expect and went on and said, "What does he model?" Well, Sarah thought a little bit more about my work and said, "Genes." (Laughter) "He models genes."

Es uno de los retos de nuestra profesión intentar explicar lo que hacemos. No estamos en la cima de las listas de invitados a cenar, a charlar y a cosas así. Y es algo que nunca he encontrado cómo hacer. Pero mi esposa – entonces mi novia - lo logró mucho mejor de lo que yo jamás he podido. Hace muchos años, cuando empezábamos a salir, ella trabajaba para la BBC en Gran Bretaña, y yo estaba, en ese momento, trabajando en Estados Unidos. Yo estaba de vuelta visitándola. Le dijo lo siguiente a una de sus compañeras de trabajo, quien preguntó: “¿Bien, qué es lo que hace tu novio?” Sarah pensó bastante acerca de las cosas que yo le había explicado - y se concentró, aquellos días, en escuchar. (Risas) No le digan que dije eso. Y estaba pensando en el trabajo que yo hacía desarrollando modelos matemáticos para comprender la evolución y la genética modernas. Así que cuando su compañera le preguntó: “¿Qué hace?” Ella hizo una pausa y dijo: “Modela cosas”. (Risas) Bueno, su compañera súbitamente se interesó mucho más de lo que cabía esperar y siguió preguntando: “¿Qué es lo que modela?” Bien, Sarah pensó un poco más acerca de mi trabajo y dijo: “Genes”. (Risa) "Modela genes”.

That is my first love, and that's what I'll tell you a little bit about. What I want to do more generally is to get you thinking about the place of uncertainty and randomness and chance in our world, and how we react to that, and how well we do or don't think about it. So you've had a pretty easy time up till now -- a few laughs, and all that kind of thing -- in the talks to date. You've got to think, and I'm going to ask you some questions. So here's the scene for the first question I'm going to ask you. Can you imagine tossing a coin successively? And for some reason -- which shall remain rather vague -- we're interested in a particular pattern. Here's one -- a head, followed by a tail, followed by a tail.

Este es mi primer amor, y de ello les voy a hablar un poco. Lo que sobre todo quiero conseguir es que piensen acerca del lugar que la incertidumbre, el azar y la probabilidad ocupan en nuestro mundo, y cómo reaccionamos frente a ello, y que tan bien razonamos o no respecto a esto. Así que lo han tenido bastante fácil hasta ahora -- algunas risas y cosas por el estilo – en las conferencias hasta ahora. Tienen que pensar, y voy a hacerles algunas preguntas. Así que aquí está el escenario de la primera pregunta que tengo: ¿Pueden imaginarse lanzando una moneda sucesivamente? Y por alguna razón – la cual tendrá que quedar sin precisar -- estamos interesados en un patrón en particular. Aquí hay uno: cara, seguida de cruz, seguida de otra cruz.

So suppose we toss a coin repeatedly. Then the pattern, head-tail-tail, that we've suddenly become fixated with happens here. And you can count: one, two, three, four, five, six, seven, eight, nine, 10 -- it happens after the 10th toss. So you might think there are more interesting things to do, but humor me for the moment. Imagine this half of the audience each get out coins, and they toss them until they first see the pattern head-tail-tail. The first time they do it, maybe it happens after the 10th toss, as here. The second time, maybe it's after the fourth toss. The next time, after the 15th toss. So you do that lots and lots of times, and you average those numbers. That's what I want this side to think about.

Así que supongan que lanzamos una moneda repetidamente. Entonces la pauta cara-cruz-cruz, con la que nos hemos obsesionado, aparece. Y pueden contar: uno, dos, tres, cuatro, cinco, seis, siete, ocho, nueve, diez -- ocurre después del décimo lanzamiento. Deben de pensar que hay cosas más interesantes que hacer, pero síganme la corriente un momento. Imagínense que cada uno en este lado de la audiencia saca una moneda y la lanza hasta que logra el patrón cara-cruz-cruz. La primera vez que lo hacen, tal vez pase después del décimo lanzamiento, como aquí. La segunda vez, tal vez sea después del cuarto. Y la próxima, después del decimoquinto. Así que lanzan la moneda muchas, muchas veces, y calculan la media de esos números. En eso es en lo que quiero que este lado piense.

The other half of the audience doesn't like head-tail-tail -- they think, for deep cultural reasons, that's boring -- and they're much more interested in a different pattern -- head-tail-head. So, on this side, you get out your coins, and you toss and toss and toss. And you count the number of times until the pattern head-tail-head appears and you average them. OK? So on this side, you've got a number -- you've done it lots of times, so you get it accurately -- which is the average number of tosses until head-tail-tail. On this side, you've got a number -- the average number of tosses until head-tail-head.

El otro lado de la audiencia no quiere cara-cruz-cruz -- piensan que, por profundas razones culturales, es aburrido -- y están mucho más interesados en otro patrón -- cara-cruz-cara. Así que en este lado, sacan sus monedas, y las lanzan y lanzan y lanzan. Y cuentan los lanzamientos hasta que el patrón cara-cruz-cara aparece y sacan la media. ¿De acuerdo? Así que en este lado, tienen un número -- lo han hecho muchas veces, así que el número es preciso -- que es el número promedio de lanzamientos hasta conseguir cara-cruz-cruz. En este lado, tienen otro número -- el número promedio de lanzamientos hasta conseguir cara-cruz-cara.

So here's a deep mathematical fact -- if you've got two numbers, one of three things must be true. Either they're the same, or this one's bigger than that one, or that one's bigger than this one. So what's going on here? You've all got to think about this, and you've all got to vote -- and we're not moving on until everyone's expressed a view. And I don't want to end up with a two-minute silence to give you more time to think about it. OK. So what you want to do is compare the average number of tosses until we first see head-tail-head with the average number of tosses until we first see head-tail-tail.

Aquí encontramos un hecho matemático profundo -- si tienes dos números, una de tres cosas tiene que ocurrir. O son iguales, o uno es más grande que el otro o viceversa. ¿Así que qué está pasando aquí? Todos ustedes tienen que pensarlo bien, y todos deben votar -- si no, no continuaremos. Y no quiero terminar con un silencio de dos minutos para darles tiempo para pensarlo, hasta que todos expresen una opinión. Vale. Lo que tienen que hacer es comparar el número promedio de lanzamientos hasta que conseguimos cara-cruz-cara con el número promedio de lanzamientos hasta que conseguimos cara-cruz-cruz.

Who thinks that A is true -- that, on average, it'll take longer to see head-tail-head than head-tail-tail? Who thinks that B is true -- that on average, they're the same? Who thinks that C is true -- that, on average, it'll take less time to see head-tail-head than head-tail-tail? OK, who hasn't voted yet? Because that's really naughty -- I said you had to. (Laughter) OK. So most people think B is true. And you might be relieved to know even rather distinguished mathematicians think that. It's not. A is true here. It takes longer, on average. In fact, the average number of tosses till head-tail-head is 10 and the average number of tosses until head-tail-tail is eight. How could that be? Anything different about the two patterns? There is. Head-tail-head overlaps itself. If you went head-tail-head-tail-head, you can cunningly get two occurrences of the pattern in only five tosses. You can't do that with head-tail-tail. That turns out to be important.

¿Quién cree que A es verdad -- que en promedio, tomará más tiempo ver cara-cruz-cara que cara-cruz-cruz? ¿Quién cree que B es verdad -- que el promedio es igual? ¿Quién cree que C es verdad -- que en promedio, tomará menos tiempo ver cara-cruz-cara que cara-cruz-cruz? Vale, ¿quién queda por votar? Porque eso es una travesura -- dije que tenían que votar. (Risas) Vale. La mayoría cree que B es verdad. Y les tranquilizará saber que incluso matemáticos bastante distinguidos piensan lo mismo. Pero no lo es. A es verdad. Tarda más tiempo, de media. De hecho, el número promedio de lanzamientos hasta cara-cruz-cara es 10, y la media hasta cara-cruz-cruz es 8. ¿Cómo puede ser esto? ¿Hay alguna diferencia entre los dos patrones? La hay. Cara-cruz-cara se solapa consigo mismo. Si sale cara-cruz-cara-cruz-cara, consigues astutamente dos apariciones del patrón en solo cinco lanzamientos. Eso no lo puedes hacer con cara-cruz-cruz. Y eso resulta ser importante.
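The two averages quoted in the talk, 10 for head-tail-head and 8 for head-tail-tail, can be checked in a few lines. The sketch below is not something the speaker presents. It uses a standard overlap rule for a fair coin (add 2^k for every k where the first k symbols of the pattern match its last k symbols) and then cross-checks the result by simulation. The function names, the "H"/"T" encoding, the trial count and the seed are all assumptions made for illustration.

```python
import random


def expected_wait(pattern, alphabet_size=2):
    """Exact mean number of tosses until `pattern` first appears.

    Overlap rule for equally likely symbols: add alphabet_size**k for every k
    such that the first k symbols of the pattern equal its last k symbols.
    """
    n = len(pattern)
    return sum(alphabet_size ** k
               for k in range(1, n + 1)
               if pattern[:k] == pattern[-k:])


def simulated_wait(pattern, trials=100_000, seed=0):
    """Estimate the same mean by tossing a fair coin until the pattern shows up."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        recent = ""   # the most recent len(pattern) tosses
        tosses = 0
        while recent != pattern:
            recent = (recent + rng.choice("HT"))[-len(pattern):]
            tosses += 1
        total += tosses
    return total / trials


if __name__ == "__main__":
    for p in ("HTH", "HTT"):
        print(p, "exact:", expected_wait(p), "simulated:", round(simulated_wait(p), 2))
```

Head-tail-head picks up an extra 2 because its first symbol matches its last one, which is exactly the self-overlap the talk points to. Head-tail-tail has no such overlap, so only the full-length term 2^3 = 8 remains.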

There are two ways of thinking about this. I'll give you one of them. So imagine -- let's suppose we're doing it. On this side -- remember, you're excited about head-tail-tail; you're excited about head-tail-head. We start tossing a coin, and we get a head -- and you start sitting on the edge of your seat because something great and wonderful, or awesome, might be about to happen. The next toss is a tail -- you get really excited. The champagne's on ice just next to you; you've got the glasses chilled to celebrate. You're waiting with bated breath for the final toss. And if it comes down a head, that's great. You're done, and you celebrate. If it's a tail -- well, rather disappointedly, you put the glasses away and put the champagne back. And you keep tossing, to wait for the next head, to get excited.

Hay dos maneras de pensar acerca de esto. Les mostraré una de ellas. Imaginen – supongamos que lo estamos haciendo. De este lado – recuerden, están entusiasmados con cara-cruz-cruz -- y ustedes están entusiasmados con cara-cruz-cara. Lanzamos la moneda, y sale cara -- y están al borde de su asiento porque algo grandioso y maravilloso, o increíble, puede estar a punto de suceder. El siguiente lanzamiento sale cruz — están realmente entusiasmados. El champán está metido en el hielo a su lado, y tienen las copas heladas listas para celebrar. Esperan con el corazón en la mano el último lanzamiento. Y si sale cara, es grandioso. Lo lograron, y lo celebran. Si sale cruz – bueno, es decepcionante, guardan las copas y ponen el champán en su lugar. Y siguen lanzando, esperando la siguiente cara, para entusiasmarse.

On this side, there's a different experience. It's the same for the first two parts of the sequence. You're a little bit excited with the first head -- you get rather more excited with the next tail. Then you toss the coin. If it's a tail, you crack open the champagne. If it's a head you're disappointed, but you're still a third of the way to your pattern again. And that's an informal way of presenting it -- that's why there's a difference. Another way of thinking about it -- if we tossed a coin eight million times, then we'd expect a million head-tail-heads and a million head-tail-tails -- but the head-tail-heads could occur in clumps. So if you want to put a million things down amongst eight million positions and you can have some of them overlapping, the clumps will be further apart. It's another way of getting the intuition.

En este lado, la experiencia es diferente. Es igual durante las primeras dos partes de la secuencia. Están un poco entusiasmados con la primera cara -- y bastante más con la siguiente cruz. Entonces lanzan la moneda. Si es cruz, abren el champán. Si es cara, están algo decepcionados, pero ya tienen de nuevo una tercera parte de su patrón. Y esa es una manera informal de presentarlo -- por eso hay una diferencia. Otra manera de verlo -- si lanzamos la moneda ocho millones de veces, esperaríamos un millón de cara-cruz-cara y un millón de cara-cruz-cruz -- pero las cara-cruz-cara podrían aparecer agrupadas. Así que si ponen un millón de cosas entre ocho millones de posiciones y algunas de ellas pueden traslaparse, los grupos estarán más lejos entre sí. Esa es otra manera de intuirlo.
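The clumping argument can be made concrete with a rough sketch like the one below. It counts every occurrence of each pattern in one long random sequence and then counts how many separate clumps those occurrences form, where an occurrence that shares tosses with the previous one is treated as part of the same clump. The sequence length (800,000 rather than the talk's eight million, to keep the run short), the seed and the helper names are assumptions for illustration only.

```python
import random


def positions(seq, pattern):
    """Start indices of every occurrence of pattern, overlaps included."""
    return [i for i in range(len(seq) - len(pattern) + 1)
            if seq.startswith(pattern, i)]


def summarize(seq, pattern):
    """Return (number of occurrences, number of clumps).

    Two successive occurrences belong to the same clump when they share
    at least one toss, i.e. their start indices are closer than len(pattern).
    """
    pos = positions(seq, pattern)
    overlapping = sum(1 for a, b in zip(pos, pos[1:]) if b - a < len(pattern))
    return len(pos), len(pos) - overlapping


if __name__ == "__main__":
    rng = random.Random(1)
    n = 800_000  # assumption: one tenth of the talk's eight million tosses
    seq = "".join(rng.choice("HT") for _ in range(n))
    for p in ("HTH", "HTT"):
        count, clumps = summarize(seq, p)
        print(p, "occurrences:", count, "clumps:", clumps)
```

Both patterns should turn up about n/8 times, but only head-tail-head can form clumps of more than one occurrence. It therefore ends up with fewer clumps spread over the same number of tosses, which is the sense in which the clumps are further apart.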

What's the point I want to make? It's a very, very simple example, an easily stated question in probability, which every -- you're in good company -- everybody gets wrong. This is my little diversion into my real passion, which is genetics. There's a connection between head-tail-heads and head-tail-tails in genetics, and it's the following. When you toss a coin, you get a sequence of heads and tails. When you look at DNA, there's a sequence of not two things -- heads and tails -- but four letters -- As, Gs, Cs and Ts. And there are little chemical scissors, called restriction enzymes which cut DNA whenever they see particular patterns. And they're an enormously useful tool in modern molecular biology. And instead of asking the question, "How long until I see a head-tail-head?" -- you can ask, "How big will the chunks be when I use a restriction enzyme which cuts whenever it sees G-A-A-G, for example? How long will those chunks be?"

¿Qué es lo que quiero decir? Es un ejemplo muy, muy simple, una pregunta sencilla de probabilidad, que absolutamente – y están en buena compañía – todos responden mal. Este es mi pequeño entretenimiento en relación con mi pasión verdadera, que es la genética. Hay una relación entre cara-cruz-cara y cara-cruz-cruz en la genética, y es la siguiente. Cuando lanzas una moneda, obtienes una secuencia de caras y cruces. Cuando vemos el ADN, hay una secuencia de no solo dos cosas – caras y cruces -- sino de cuatro letras — A, G, C y T. Y hay unas pequeñas tijeras químicas, llamadas enzimas de restricción que cortan el ADN cuando ven un cierto patrón. Y son una herramienta enormemente útil en la biología molecular moderna. Y en vez de preguntar: “¿Cuántos lanzamientos hasta conseguir cara-cruz-cara?” -- podemos preguntar: “¿Cómo de grandes serán los pedazos cuando uso una enzima de restricción que corta cuando ve, por ejemplo, G-A-A-G?” ¿Cómo de largos serán esos pedazos?
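The restriction-enzyme question can be explored the same way as the coin question. The sketch below is a toy model, not a description of any real enzyme. It assumes the four letters are equally likely and independent, that the cut falls at the start of each G-A-A-G site, and that overlapping sites each produce a cut. The function names, sequence length and seed are hypothetical.

```python
import random


def cut_sites(dna, site):
    """Start indices of every occurrence of the recognition site."""
    return [i for i in range(len(dna) - len(site) + 1)
            if dna.startswith(site, i)]


def fragment_lengths(dna, site):
    """Lengths of the pieces left after cutting at the start of every site.

    Assumption: the cut falls at the first letter of the site; zero-length
    pieces (a site at position 0) are dropped.
    """
    bounds = [0] + cut_sites(dna, site) + [len(dna)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    rng = random.Random(2)
    dna = "".join(rng.choice("ACGT") for _ in range(1_000_000))
    pieces = fragment_lengths(dna, "GAAG")
    print("fragments:", len(pieces))
    print("mean fragment length:", round(sum(pieces) / len(pieces), 1))
```

Under these assumptions a four-letter site is expected about once every 4^4 = 256 positions, so the mean chunk length should come out in that region. Real genomes are not uniform random strings, so this only illustrates the shape of the question.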

That's a rather trivial connection between probability and genetics. There's a much deeper connection, which I don't have time to go into, and that is that modern genetics is a really exciting area of science. And we'll hear some talks later in the conference specifically about that. But it turns out that a key part of unlocking the secrets in the information generated by modern experimental technologies has to do with fairly sophisticated -- you'll be relieved to know that I do something useful in my day job, rather more sophisticated than the head-tail-head story -- quite sophisticated computer modeling, mathematical modeling and modern statistical techniques. And I will give you two little snippets -- two examples -- of projects we're involved in in my group in Oxford, both of which I think are rather exciting. You know about the Human Genome Project. That was a project which aimed to read one copy of the human genome. The natural thing to do after you've done that -- and that's what this project is about -- is the International HapMap Project, which is a collaboration between labs in five or six different countries. Think of the Human Genome Project as learning what we've got in common, and the HapMap Project is trying to understand where there are differences between different people.

Esa es una conexión algo trivial entre la probabilidad y la genética. Hay una conexión mucho más profunda, que no tengo tiempo de explorar, y es que la genética moderna es un área realmente apasionante de la ciencia. Y oiremos algunas charlas más tarde específicamente acerca de esto. Pero resulta que, para descubrir los secretos en la información producida por las tecnologías experimentales modernas, una parte clave tiene que ver con modelos bastante sofisticados -- estarán felices de saber que hago algo útil en mi trabajo diario, bastante más sofisticado que la historia de cara-cruz-cara -- modelos computacionales y matemáticos bastante sofisticados y técnicas estadísticas modernas. Y les voy a dar dos pequeños fragmentos -- dos ejemplos -- de proyectos que lleva mi grupo en Oxford, los cuales creo que son bastante apasionantes. Ustedes han oído hablar acerca del Proyecto del Genoma Humano. Fue un proyecto que intentaba descifrar una copia del genoma humano. Lo que sigue naturalmente después de lograrlo es este otro proyecto, el Proyecto Internacional HapMap, el cual es una colaboración entre laboratorios de cinco o seis países. Piensen que en el Proyecto del Genoma Humano se trata de aprender qué tenemos en común, y el proyecto HapMap intenta entender dónde están las diferencias entre las distintas personas.

Why do we care about that? Well, there are lots of reasons. The most pressing one is that we want to understand how some differences make some people susceptible to one disease -- type-2 diabetes, for example -- and other differences make people more susceptible to heart disease, or stroke, or autism and so on. That's one big project. There's a second big project, recently funded by the Wellcome Trust in this country, involving very large studies -- thousands of individuals with each of eight different diseases, common diseases like type-1 and type-2 diabetes, and coronary heart disease, bipolar disease and so on -- to try and understand the genetics. To try and understand what it is about genetic differences that causes the diseases. Why do we want to do that? Because we understand very little about most human diseases. We don't know what causes them. And if we can get in at the bottom and understand the genetics, we'll have a window on the way the disease works, and a whole new way of thinking about disease therapies and preventative treatment and so on. So that's, as I said, the little diversion on my main love.

¿Por qué nos importa? Bueno, hay muchas razones. La más urgente es que queremos entender cómo es que algunas diferencias hacen a algunas personas propensas a cierta enfermedad –como la diabetes del tipo 2 -- y otras diferencias hacen a ciertas personas más propensas a las enfermedades cardíacas, o a las apoplejías, o al autismo, etcétera. Ese es un gran proyecto. Hay otro gran proyecto, recientemente financiado por el Wellcome Trust en este país, involucrando grandes estudios – miles de individuos, con ocho enfermedades diferentes, enfermedades comunes como la diabetes del tipo 1 y 2, enfermedades coronarias, trastorno bipolar, y otras – para intentar entender la genética. Para intentar entender qué diferencias genéticas y por qué causan las enfermedades. ¿Por qué queremos hacerlo? Porque entendemos muy poco acerca de la mayoría de las enfermedades humanas. No sabemos qué las causa. Y si podemos llegar al fondo y entender la genética, tendremos una ventana al modo de actuación de la enfermedad. Y una manera completamente nueva de ver las terapias y el tratamiento preventivo y todo lo demás. Así que ese es, como dije, el pequeño entretenimiento dentro de mi verdadero amor.

Back to some of the more mundane issues of thinking about uncertainty. Here's another quiz for you -- now suppose we've got a test for a disease which isn't infallible, but it's pretty good. It gets it right 99 percent of the time. And I take one of you, or I take someone off the street, and I test them for the disease in question. Let's suppose there's a test for HIV -- the virus that causes AIDS -- and the test says the person has the disease. What's the chance that they do? The test gets it right 99 percent of the time. So a natural answer is 99 percent. Who likes that answer? Come on -- everyone's got to get involved. Don't think you don't trust me anymore. (Laughter) Well, you're right to be a bit skeptical, because that's not the answer. That's what you might think. It's not the answer, and it's not because it's only part of the story. It actually depends on how common or how rare the disease is. So let me try and illustrate that. Here's a little caricature of a million individuals. So let's think about a disease that affects -- it's pretty rare, it affects one person in 10,000. Amongst these million individuals, most of them are healthy and some of them will have the disease. And in fact, if this is the prevalence of the disease, about 100 will have the disease and the rest won't. So now suppose we test them all. What happens? Well, amongst the 100 who do have the disease, the test will get it right 99 percent of the time, and 99 will test positive. Amongst all these other people who don't have the disease, the test will get it right 99 percent of the time. It'll only get it wrong one percent of the time. But there are so many of them that there'll be an enormous number of false positives. Put that another way -- of all of them who test positive -- so here they are, the individuals involved -- less than one in 100 actually have the disease. 
So even though we think the test is accurate, the important part of the story is there's another bit of information we need.

De vuelta a consideraciones más mundanas sobre nuestro razonamiento con la incertidumbre. Aquí hay otro acertijo para ustedes -- supongan que tenemos una prueba para detectar una enfermedad. No es infalible, pero es bastante buena. Acierta el 99% de las veces. Y tomo a uno de ustedes o a alguien al azar en la calle, y les hago esta prueba de la enfermedad. Supongamos que es una prueba para el VIH – el virus que causa el SIDA -- y la prueba dice que la persona está enferma. ¿Cuál es la probabilidad de que la tenga? La prueba acierta el 99% de las veces. Así que la respuesta natural es 99%. ¿A quién le gusta esa respuesta? Vamos – todos tenemos que participar. No crean que ya no pueden confiar en mí. (Risas) Bien, está bien que se sientan un poco escépticos, porque esa no es la respuesta. Es lo que podrían pensar. No es la respuesta, y no lo es porque es sólo una parte de la historia. En realidad, depende de que tan común sea la enfermedad. Déjenme intentar mostrárselo. Tenemos una muestra de un millón de individuos. Así que pensemos en una enfermedad que afecta -- es bastante rara, afecta a una persona de cada 10,000. En este millón de individuos, la mayoría están sanos y algunos tendrán la enfermedad. De hecho, si esta es la frecuencia de la enfermedad, alrededor de 100 tendrán la enfermedad y el resto no. Ahora supongamos que le hacemos la prueba a todos. ¿Qué ocurre? Bueno, entre los 100 que tienen la enfermedad, la prueba acertará el 99% de las veces, y 99 saldrán positivos. Entre todas las personas que no tienen la enfermedad, la prueba acertará el 99% de las veces. Solamente se equivocará un 1% de veces. Pero hay tantos que habrá un número enorme de falsos positivos. Poniéndolo de otra manera -- de todo aquel que resulte positivo – aquí están, los individuos involucrados -- menos de uno de cada 100 tendrá realmente la enfermedad. Así que aún si pensamos que la prueba es precisa, la parte importante de la historia es que hay otra información necesaria.
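The head count in the talk can be reproduced directly. The sketch below uses the talk's own figures: a million people, a disease affecting one in 10,000, and a test that is right 99 percent of the time. It reads "right 99 percent of the time" as applying equally to people with and without the disease, which is how the talk uses it. The function name and the choice of exact fractions are assumptions for illustration.

```python
from fractions import Fraction


def screen(population, prevalence, accuracy):
    """Return (true positives, false positives, chance a positive is real).

    Assumption: the test is right with probability `accuracy` both for
    people who have the disease and for people who do not.
    """
    sick = population * prevalence
    healthy = population - sick
    true_pos = sick * accuracy            # sick and correctly flagged
    false_pos = healthy * (1 - accuracy)  # healthy but wrongly flagged
    return true_pos, false_pos, true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    tp, fp, ppv = screen(1_000_000, Fraction(1, 10_000), Fraction(99, 100))
    print("true positives:", tp)    # 99
    print("false positives:", fp)   # 9999
    print("chance a positive result is real:", ppv, "=", float(ppv))
```

With these inputs, 99 of the 10,098 positive results belong to people who really have the disease, which is 1 in 102. That matches the talk's "less than one in 100".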

Here's the key intuition. What we have to do, once we know the test is positive, is to weigh up the plausibility, or the likelihood, of two competing explanations. Each of those explanations has a likely bit and an unlikely bit. One explanation is that the person doesn't have the disease -- that's overwhelmingly likely, if you pick someone at random -- but the test gets it wrong, which is unlikely. The other explanation is that the person does have the disease -- that's unlikely -- but the test gets it right, which is likely. And the number we end up with -- that number which is a little bit less than one in 100 -- is to do with how likely one of those explanations is relative to the other. Each of them taken together is unlikely.

Esta es la intuición clave. Lo que tenemos que hacer, una vez que sabemos que la prueba es positiva es considerar la plausibilidad, o probabilidad, de dos explicaciones que compiten. Cada una de esas explicaciones tiene una parte probable y una parte improbable. Una explicación es que la persona no tiene la enfermedad -- lo que es abrumadoramente probable, si tomas a alguien al azar -- pero la prueba se equivoca, lo que es improbable. La otra explicación es que la persona está enferma – lo que es improbable -- pero la prueba es correcta, lo que es probable. Y el número con el que nos encontramos -- ese número que es un poco menos del 1% -- tiene que ver con qué tan probable una de esas explicaciones es relativa a la otra. Cada una en conjunto es improbable.
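The weighing of the two explanations can be written as a single line of arithmetic using the odds form of Bayes' rule. The symbols are assumptions introduced here, not notation from the talk: D means the person has the disease, \bar{D} means they do not, and + means the test came back positive. The numbers are the talk's, a prevalence of one in 10,000 and a test that is right 99 percent of the time.

```latex
\[
\frac{P(D \mid +)}{P(\bar{D} \mid +)}
  = \underbrace{\frac{P(D)}{P(\bar{D})}}_{\text{how common}}
    \times
    \underbrace{\frac{P(+ \mid D)}{P(+ \mid \bar{D})}}_{\text{how good the test is}}
  = \frac{1}{9999} \times \frac{0.99}{0.01}
  = \frac{99}{9999}
  = \frac{1}{101}
\]
```

Odds of 1 to 101 correspond to a probability of 1/102, a little less than one in 100. The unlikely part of each explanation appears in one factor and the likely part in the other.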

Here's a more topical example of exactly the same thing. Those of you in Britain will know about what's become rather a celebrated case of a woman called Sally Clark, who had two babies who died suddenly. And initially, it was thought that they died of what's known informally as "cot death," and more formally as "Sudden Infant Death Syndrome." For various reasons, she was later charged with murder. And at the trial, her trial, a very distinguished pediatrician gave evidence that the chance of two cot deaths, innocent deaths, in a family like hers -- which was professional and non-smoking -- was one in 73 million. To cut a long story short, she was convicted at the time. Later, and fairly recently, acquitted on appeal -- in fact, on the second appeal. And just to set it in context, you can imagine how awful it is for someone to have lost one child, and then two, if they're innocent, to be convicted of murdering them. To be put through the stress of the trial, convicted of murdering them -- and to spend time in a women's prison, where all the other prisoners think you killed your children -- is a really awful thing to happen to someone. And it happened in large part here because the expert got the statistics horribly wrong, in two different ways.

So where did he get the one in 73 million number? He looked at some research, which said the chance of one cot death in a family like Sally Clark's is about one in 8,500. So he said, "I'll assume that if you have one cot death in a family, the chance of a second child dying from cot death isn't changed." So that's what statisticians would call an assumption of independence. It's like saying, "If you toss a coin and get a head the first time, that won't affect the chance of getting a head the second time." So if you toss a coin twice, the chance of getting a head twice is a half -- that's the chance the first time -- times a half -- the chance the second time. So he said, "Here, I'll assume that these events are independent. When you multiply 8,500 by itself, you get about 73 million." And none of this was stated to the court as an assumption or presented to the jury that way. Unfortunately here -- and, really, regrettably -- first of all, in a situation like this you'd have to verify it empirically. And secondly, it's palpably false. There are lots and lots of things that we don't know about sudden infant deaths. It might well be that there are environmental factors that we're not aware of, and it's pretty likely to be the case that there are genetic factors we're not aware of. So if a family suffers from one cot death, you'd put them in a high-risk group. They've probably got these environmental risk factors and/or genetic risk factors we don't know about. And to argue, then, that the chance of a second death is as if you didn't know that information is really silly. It's worse than silly -- it's really bad science. Nonetheless, that's how it was presented, and at trial nobody even argued it. That's the first problem. The second problem is, what does the number of one in 73 million mean? 
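
The arithmetic behind the one-in-73-million figure can be sketched in a few lines of Python. This is an illustration only, not the expert's actual working: it uses the rounded one-in-8,500 figure quoted above, and the `relative_risk_after_first` parameter and its value of 10 are hypothetical, chosen purely to show how much the answer depends on the independence assumption.

```python
from fractions import Fraction

# Rounded figure quoted in the talk: chance of one cot death in such a family.
p_one = Fraction(1, 8500)

# Independence assumption: the second death is treated as unaffected by the first,
# so the two probabilities are simply multiplied.
p_two_independent = p_one * p_one


def p_two_deaths(p_first, relative_risk_after_first):
    """Chance of two deaths when a first death changes the risk of a second.

    relative_risk_after_first is a HYPOTHETICAL multiplier, not a measured value.
    A value of 1 reproduces the independence assumption.
    """
    p_second_given_first = p_first * relative_risk_after_first
    return p_first * p_second_given_first


# Invented multiplier of 10, for illustration only.
p_two_dependent = p_two_deaths(p_one, 10)

print("independent:", p_two_independent)
print("dependent (hypothetical x10):", p_two_dependent)
```

With the rounded input, squaring gives one in 72.25 million, close to the figure quoted in court. If a first death made a second one ten times more likely (an invented multiplier), the same calculation would give roughly one in 7.2 million, which is why the unstated assumption matters so much.
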
So after Sally Clark was convicted -- you can imagine, it made rather a splash in the press -- one of the journalists from one of Britain's more reputable newspapers wrote that what the expert had said was, "The chance that she was innocent was one in 73 million." Now, that's a logical error. It's exactly the same logical error as the logical error of thinking that after the disease test, which is 99 percent accurate, the chance of having the disease is 99 percent. In the disease example, we had to bear in mind two things, one of which was the possibility that the test got it right or not. And the other one was the chance, a priori, that the person had the disease or not. It's exactly the same in this context. There are two things involved -- two parts to the explanation. We want to know how likely, or relatively how likely, two different explanations are. One of them is that Sally Clark was innocent -- which is, a priori, overwhelmingly likely -- most mothers don't kill their children. And the second part of the explanation is that she suffered an incredibly unlikely event. Not as unlikely as one in 73 million, but nonetheless rather unlikely. The other explanation is that she was guilty. Now, we probably think a priori that's unlikely. And we certainly should think in the context of a criminal trial that that's unlikely, because of the presumption of innocence. And then if she were trying to kill the children, she succeeded. So the chance that she's innocent isn't one in 73 million. We don't know what it is. It has to do with weighing up the strength of the other evidence against her and the statistical evidence. We know the children died. What matters is how likely or unlikely, relative to each other, the two explanations are. And they're both implausible. There's a situation where errors in statistics had really profound and really unfortunate consequences. 
In fact, there are two other women who were convicted on the basis of the evidence of this pediatrician, who have subsequently been released on appeal. Many cases were reviewed. And it's particularly topical because he's currently facing a disrepute charge at Britain's General Medical Council.
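
The weighing of the two explanations described above can also be sketched in Python. The talk is explicit that the real probability of innocence is not known, so every number here is a placeholder: the one-in-73-million figure is taken at face value only for the sake of argument, and the one-in-100-million rate for a double murder is entirely hypothetical. The sketch also assumes these are the only two explanations and ignores all other evidence.

```python
from fractions import Fraction


def probability_innocent(p_two_natural_deaths, p_two_murders):
    """P(innocent | two deaths), assuming only these two explanations exist.

    Both arguments are prior probabilities of each explanation producing
    the observed outcome (two dead children in one family).
    """
    return p_two_natural_deaths / (p_two_natural_deaths + p_two_murders)


# The expert's figure, accepted here only for the sake of argument.
p_natural = Fraction(1, 73_000_000)

# HYPOTHETICAL rate of a mother murdering two of her children; not real data.
p_murder = Fraction(1, 100_000_000)

p_innocent = probability_innocent(p_natural, p_murder)
print("P(innocent | two deaths) =", p_innocent, "~", round(float(p_innocent), 2))
```

Under these made-up inputs the probability of innocence comes out at 100/173, or about 0.58, nowhere near one in 73 million. The specific value means nothing; the point is that a small number for one explanation says little until it is compared with the equally small number for the other.
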

So just to conclude -- what are the take-home messages from this? Well, we know that randomness and uncertainty and chance are very much a part of our everyday life. It's also true -- and, although, you, as a collective, are very special in many ways, you're completely typical in not getting the examples I gave right. It's very well documented that people get things wrong. They make errors of logic in reasoning with uncertainty. We can cope with the subtleties of language brilliantly -- and there are interesting evolutionary questions about how we got here. We are not good at reasoning with uncertainty. That's an issue in our everyday lives. As you've heard from many of the talks, statistics underpins an enormous amount of research in science -- in social science, in medicine and indeed, quite a lot of industry. All of quality control, which has had a major impact on industrial processing, is underpinned by statistics. It's something we're bad at doing. At the very least, we should recognize that, and we tend not to. To go back to the legal context, at the Sally Clark trial all of the lawyers just accepted what the expert said. So if a pediatrician had come out and said to a jury, "I know how to build bridges. I've built one down the road. Please drive your car home over it," they would have said, "Well, pediatricians don't know how to build bridges. That's what engineers do." On the other hand, he came out and effectively said, or implied, "I know how to reason with uncertainty. I know how to do statistics." And everyone said, "Well, that's fine. He's an expert." So we need to understand where our competence is and isn't. Exactly the same kinds of issues arose in the early days of DNA profiling, when scientists, and lawyers and in some cases judges, routinely misrepresented evidence. Usually -- one hopes -- innocently, but misrepresented evidence. Forensic scientists said, "The chance that this guy's innocent is one in three million." 
Even if you believe the number, just like the 73 million to one, that's not what it meant. And there have been celebrated appeal cases in Britain and elsewhere because of that.

And just to finish in the context of the legal system. It's all very well to say, "Let's do our best to present the evidence." But more and more, in cases of DNA profiling -- this is another one -- we expect juries, who are ordinary people -- and it's documented they're very bad at this -- we expect juries to be able to cope with the sorts of reasoning that go on. In other spheres of life, if people argued -- well, except possibly for politics -- but in other spheres of life, if people argued illogically, we'd say that's not a good thing. We sort of expect it of politicians and don't hope for much more. In the case of uncertainty, we get it wrong all the time -- and at the very least, we should be aware of that, and ideally, we might try and do something about it. Thanks very much.
