Peter Donnelly: How juries are fooled by statistics

As other speakers have said, it's a rather daunting experience -- a particularly daunting experience -- to be speaking in front of this audience. But unlike the other speakers, I'm not going to tell you about the mysteries of the universe, or the wonders of evolution, or the really clever, innovative ways people are attacking the major inequalities in our world. Or even the challenges of nation-states in the modern global economy. My brief, as you've just heard, is to tell you about statistics -- and, to be more precise, to tell you some exciting things about statistics. And that's -- (Laughter) -- that's rather more challenging than all the speakers before me and all the ones coming after me. (Laughter) One of my senior colleagues told me, when I was a youngster in this profession, rather proudly, that statisticians were people who liked figures but didn't have the personality skills to become accountants. (Laughter) And there's another in-joke among statisticians, and that's, "How do you tell the introverted statistician from the extroverted statistician?" To which the answer is, "The extroverted statistician's the one who looks at the other person's shoes." (Laughter) But I want to tell you something useful -- and here it is, so concentrate now. This evening, there's a reception in the University's Museum of Natural History. And it's a wonderful setting, as I hope you'll find, and a great icon to the best of the Victorian tradition. It's very unlikely -- in this special setting, and this collection of people -- but you might just find yourself talking to someone you'd rather wish that you weren't. So here's what you do. When they say to you, "What do you do?" -- you say, "I'm a statistician." (Laughter) Well, except they've been pre-warned now, and they'll know you're making it up. And then one of two things will happen. They'll either discover their long-lost cousin in the other corner of the room and run over and talk to them. 
Or they'll suddenly become parched and/or hungry -- and often both -- and sprint off for a drink and some food. And you'll be left in peace to talk to the person you really want to talk to.

It's one of the challenges in our profession to try and explain what we do. We're not top on people's lists for dinner party guests and conversations and so on. And it's something I've never really found a good way of doing. But my wife -- who was then my girlfriend -- managed it much better than I've ever been able to. Many years ago, when we first started going out, she was working for the BBC in Britain, and I was, at that stage, working in America. I was coming back to visit her. She told this to one of her colleagues, who said, "Well, what does your boyfriend do?" Sarah thought quite hard about the things I'd explained -- and she concentrated, in those days, on listening. (Laughter) Don't tell her I said that. And she was thinking about the work I did developing mathematical models for understanding evolution and modern genetics. So when her colleague said, "What does he do?" She paused and said, "He models things." (Laughter) Well, her colleague suddenly got much more interested than I had any right to expect and went on and said, "What does he model?" Well, Sarah thought a little bit more about my work and said, "Genes." (Laughter) "He models genes."

That is my first love, and that's what I'll tell you a little bit about. What I want to do more generally is to get you thinking about the place of uncertainty and randomness and chance in our world, and how we react to that, and how well we do or don't think about it. So you've had a pretty easy time up till now -- a few laughs, and all that kind of thing -- in the talks to date. You've got to think, and I'm going to ask you some questions. So here's the scene for the first question I'm going to ask you. Can you imagine tossing a coin successively? And for some reason -- which shall remain rather vague -- we're interested in a particular pattern. Here's one -- a head, followed by a tail, followed by a tail.

So suppose we toss a coin repeatedly. Then the pattern, head-tail-tail, that we've suddenly become fixated with happens here. And you can count: one, two, three, four, five, six, seven, eight, nine, 10 -- it happens after the 10th toss. So you might think there are more interesting things to do, but humor me for the moment. Imagine this half of the audience each get out coins, and they toss them until they first see the pattern head-tail-tail. The first time they do it, maybe it happens after the 10th toss, as here. The second time, maybe it's after the fourth toss. The next time, after the 15th toss. So you do that lots and lots of times, and you average those numbers. That's what I want this side to think about.

The other half of the audience doesn't like head-tail-tail -- they think, for deep cultural reasons, that's boring -- and they're much more interested in a different pattern -- head-tail-head. So, on this side, you get out your coins, and you toss and toss and toss. And you count the number of times until the pattern head-tail-head appears and you average them. OK? So on this side, you've got a number -- you've done it lots of times, so you get it accurately -- which is the average number of tosses until head-tail-tail. On this side, you've got a number -- the average number of tosses until head-tail-head.

So here's a deep mathematical fact -- if you've got two numbers, one of three things must be true. Either they're the same, or this one's bigger than this one, or this one's bigger than that one. So what's going on here? So you've all got to think about this, and you've all got to vote -- and we're not moving on. And I don't want to end up in the two-minute silence to give you more time to think about it, until everyone's expressed a view. OK. So what you want to do is compare the average number of tosses until we first see head-tail-head with the average number of tosses until we first see head-tail-tail.

Who thinks that A is true -- that, on average, it'll take longer to see head-tail-head than head-tail-tail? Who thinks that B is true -- that on average, they're the same? Who thinks that C is true -- that, on average, it'll take less time to see head-tail-head than head-tail-tail? OK, who hasn't voted yet? Because that's really naughty -- I said you had to. (Laughter) OK. So most people think B is true. And you might be relieved to know even rather distinguished mathematicians think that. It's not. A is true here. It takes longer, on average. In fact, the average number of tosses till head-tail-head is 10 and the average number of tosses until head-tail-tail is eight. How could that be? Anything different about the two patterns? There is. Head-tail-head overlaps itself. If you went head-tail-head-tail-head, you can cunningly get two occurrences of the pattern in only five tosses. You can't do that with head-tail-tail. That turns out to be important.

There are two ways of thinking about this. I'll give you one of them. So imagine -- let's suppose we're doing it. On this side -- remember, you're excited about head-tail-tail; you're excited about head-tail-head. We start tossing a coin, and we get a head -- and you start sitting on the edge of your seat because something great and wonderful, or awesome, might be about to happen. The next toss is a tail -- you get really excited. The champagne's on ice just next to you; you've got the glasses chilled to celebrate. You're waiting with bated breath for the final toss. And if it comes down a head, that's great. You're done, and you celebrate. If it's a tail -- well, rather disappointedly, you put the glasses away and put the champagne back. And you keep tossing, to wait for the next head, to get excited.

On this side, there's a different experience. It's the same for the first two parts of the sequence. You're a little bit excited with the first head -- you get rather more excited with the next tail. Then you toss the coin. If it's a tail, you crack open the champagne. If it's a head you're disappointed, but you're still a third of the way to your pattern again. And that's an informal way of presenting it -- that's why there's a difference. Another way of thinking about it -- if we tossed a coin eight million times, then we'd expect a million head-tail-heads and a million head-tail-tails -- but the head-tail-heads could occur in clumps. So if you want to put a million things down amongst eight million positions and you can have some of them overlapping, the clumps will be further apart. It's another way of getting the intuition.
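To make the overlap argument concrete, here is a minimal Python sketch that is not part of the talk. It assumes a fair coin and uses a standard result about pattern waiting times: add 2^k for every k where the first k symbols of the pattern match its last k symbols. The function names `expected_wait` and `simulate_wait` are made up for illustration, and the simulation is only a sanity check on the exact figures.

```python
import random


def expected_wait(pattern: str) -> int:
    """Exact mean number of fair-coin tosses until `pattern` first appears.

    Overlap rule: add 2**k for every k where the first k symbols of the
    pattern equal its last k symbols (k = len(pattern) always counts).
    """
    n = len(pattern)
    return sum(2 ** k for k in range(1, n + 1) if pattern[:k] == pattern[-k:])


def simulate_wait(pattern: str, trials: int = 50_000, seed: int = 0) -> float:
    """Estimate the same average by actually tossing a simulated coin."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        window = ""  # the most recent len(pattern) tosses
        tosses = 0
        while window != pattern:
            window = (window + rng.choice("HT"))[-len(pattern):]
            tosses += 1
        total += tosses
    return total / trials


if __name__ == "__main__":
    for pattern in ("HTH", "HTT"):
        print(pattern, expected_wait(pattern), round(simulate_wait(pattern), 2))
```

Head-tail-head picks up an extra 2 because its first and last symbols match, which gives 10 rather than 8. Head-tail-tail has no such overlap.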

What's the point I want to make? It's a very, very simple example, an easily stated question in probability, which every -- you're in good company -- everybody gets wrong. This is my little diversion into my real passion, which is genetics. There's a connection between head-tail-heads and head-tail-tails in genetics, and it's the following. When you toss a coin, you get a sequence of heads and tails. When you look at DNA, there's a sequence of not two things -- heads and tails -- but four letters -- As, Gs, Cs and Ts. And there are little chemical scissors, called restriction enzymes which cut DNA whenever they see particular patterns. And they're an enormously useful tool in modern molecular biology. And instead of asking the question, "How long until I see a head-tail-head?" -- you can ask, "How big will the chunks be when I use a restriction enzyme which cuts whenever it sees G-A-A-G, for example? How long will those chunks be?"
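The chunk-length question can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: the "DNA" is a uniformly random string of A, C, G and T, which real genomes are not, and the cut is placed at the start of each G-A-A-G site, whereas a real enzyme cuts at its own specific position within or near the site. The helper names `fragment_lengths` and `random_dna` are hypothetical.

```python
import random
import re


def fragment_lengths(dna: str, site: str) -> list[int]:
    """Lengths of the pieces left after cutting `dna` at every start of `site`."""
    # A zero-width lookahead finds every occurrence, including overlapping ones.
    cuts = [m.start() for m in re.finditer(f"(?={re.escape(site)})", dna)]
    bounds = [0] + cuts + [len(dna)]
    # Drop the empty piece that appears if the sequence begins with the site.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def random_dna(length: int, seed: int = 0) -> str:
    """Toy sequence: each letter drawn uniformly and independently."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    pieces = fragment_lengths(random_dna(200_000), "GAAG")
    print(len(pieces), "pieces, mean length", round(sum(pieces) / len(pieces), 1))
```

With four equally likely letters, a four-letter site starts at any given position with probability 1 in 256, so the average chunk in this toy model comes out near 256 letters. Because G-A-A-G begins and ends with the same letter, its occurrences can overlap, just as head-tail-head does.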

That's a rather trivial connection between probability and genetics. There's a much deeper connection, which I don't have time to go into, and that is that modern genetics is a really exciting area of science. And we'll hear some talks later in the conference specifically about that. But it turns out that a key part of unlocking the secrets in the information generated by modern experimental technologies has to do with fairly sophisticated -- you'll be relieved to know that I do something useful in my day job, rather more sophisticated than the head-tail-head story -- computer modeling, mathematical modeling and modern statistical techniques. And I will give you two little snippets -- two examples -- of projects we're involved in in my group in Oxford, both of which I think are rather exciting. You know about the Human Genome Project. That was a project which aimed to read one copy of the human genome. The natural thing to do after you've done that is what this project, the International HapMap Project, sets out to do; it is a collaboration between labs in five or six different countries. Think of the Human Genome Project as learning what we've got in common, and the HapMap Project is trying to understand where there are differences between different people.

Why do we care about that? Well, there are lots of reasons. The most pressing one is that we want to understand how some differences make some people susceptible to one disease -- type-2 diabetes, for example -- and other differences make people more susceptible to heart disease, or stroke, or autism and so on. That's one big project. There's a second big project, recently funded by the Wellcome Trust in this country, involving very large studies -- thousands of individuals, with each of eight different diseases, common diseases like type-1 and type-2 diabetes, and coronary heart disease, bipolar disease and so on -- to try and understand the genetics. To try and understand what it is about genetic differences that causes the diseases. Why do we want to do that? Because we understand very little about most human diseases. We don't know what causes them. And if we can get in at the bottom and understand the genetics, we'll have a window on the way the disease works, and a whole new way about thinking about disease therapies and preventative treatment and so on. So that's, as I said, the little diversion on my main love.

Back to some of the more mundane issues of thinking about uncertainty. Here's another quiz for you -- now suppose we've got a test for a disease which isn't infallible, but it's pretty good. It gets it right 99 percent of the time. And I take one of you, or I take someone off the street, and I test them for the disease in question. Let's suppose there's a test for HIV -- the virus that causes AIDS -- and the test says the person has the disease. What's the chance that they do? The test gets it right 99 percent of the time. So a natural answer is 99 percent. Who likes that answer? Come on -- everyone's got to get involved. Don't think you don't trust me anymore. (Laughter) Well, you're right to be a bit skeptical, because that's not the answer. That's what you might think. It's not the answer, and it's not because it's only part of the story. It actually depends on how common or how rare the disease is. So let me try and illustrate that. Here's a little caricature of a million individuals. So let's think about a disease that affects -- it's pretty rare, it affects one person in 10,000. Amongst these million individuals, most of them are healthy and some of them will have the disease. And in fact, if this is the prevalence of the disease, about 100 will have the disease and the rest won't. So now suppose we test them all. What happens? Well, amongst the 100 who do have the disease, the test will get it right 99 percent of the time, and 99 will test positive. Amongst all these other people who don't have the disease, the test will get it right 99 percent of the time. It'll only get it wrong one percent of the time. But there are so many of them that there'll be an enormous number of false positives. Put that another way -- of all of them who test positive -- so here they are, the individuals involved -- less than one in 100 actually have the disease. 
So even though we think the test is accurate, the important part of the story is there's another bit of information we need.
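The head count in this example can be checked with a short Python sketch. It assumes, as the talk does, that "right 99 percent of the time" applies equally to people with and without the disease. The function name `screen` is made up for illustration, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def screen(population: int, prevalence: Fraction, accuracy: Fraction):
    """Return (true positives, false positives, share of positives who are ill)."""
    sick = population * prevalence
    healthy = population - sick
    true_pos = sick * accuracy            # ill and correctly flagged
    false_pos = healthy * (1 - accuracy)  # healthy but wrongly flagged
    share_ill = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, share_ill


if __name__ == "__main__":
    tp, fp, share = screen(1_000_000, Fraction(1, 10_000), Fraction(99, 100))
    print(tp, fp, share)  # 99 9999 1/102
```

Out of 10,098 positive results only 99 belong to people who are ill, which is 1 in 102 and matches the "less than one in 100" figure.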

Here's the key intuition. What we have to do, once we know the test is positive, is to weigh up the plausibility, or the likelihood, of two competing explanations. Each of those explanations has a likely bit and an unlikely bit. One explanation is that the person doesn't have the disease -- that's overwhelmingly likely, if you pick someone at random -- but the test gets it wrong, which is unlikely. The other explanation is that the person does have the disease -- that's unlikely -- but the test gets it right, which is likely. And the number we end up with -- that number which is a little bit less than one in 100 -- is to do with how likely one of those explanations is relative to the other. Each of them taken together is unlikely.
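The weighing of the two explanations can be written as a ratio. This is a sketch in standard Bayes notation that the talk itself does not use: D stands for "has the disease", + for "tests positive", and the numbers are the one-in-10,000 prevalence and 99 percent accuracy from the example.

```latex
\[
\frac{P(D \mid +)}{P(\bar{D} \mid +)}
  = \underbrace{\frac{P(D)}{P(\bar{D})}}_{\text{unlikely vs. likely}}
    \times
    \underbrace{\frac{P(+ \mid D)}{P(+ \mid \bar{D})}}_{\text{likely vs. unlikely}}
  = \frac{0.0001}{0.9999} \times \frac{0.99}{0.01}
  = \frac{0.000099}{0.009999}
  = \frac{1}{101}
\]
```

Both joint probabilities are small (about 0.0001 and 0.01), which is the sense in which each explanation taken together is unlikely. Odds of 1 to 101 correspond to a probability of 1 in 102, a little less than one in 100.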

Here's a more topical example of exactly the same thing. Those of you in Britain will know about what's become rather a celebrated case of a woman called Sally Clark, who had two babies who died suddenly. And initially, it was thought that they died of what's known informally as "cot death," and more formally as "Sudden Infant Death Syndrome." For various reasons, she was later charged with murder. And at the trial, her trial, a very distinguished pediatrician gave evidence that the chance of two cot deaths, innocent deaths, in a family like hers -- which was professional and non-smoking -- was one in 73 million. To cut a long story short, she was convicted at the time. Later, and fairly recently, acquitted on appeal -- in fact, on the second appeal. And just to set it in context, you can imagine how awful it is for someone to have lost one child, and then two, if they're innocent, to be convicted of murdering them. To be put through the stress of the trial, convicted of murdering them -- and to spend time in a women's prison, where all the other prisoners think you killed your children -- is a really awful thing to happen to someone. And it happened in large part here because the expert got the statistics horribly wrong, in two different ways.

So where did he get the one in 73 million number? He looked at some research, which said the chance of one cot death in a family like Sally Clark's is about one in 8,500. So he said, "I'll assume that if you have one cot death in a family, the chance of a second child dying from cot death isn't changed." So that's what statisticians would call an assumption of independence. It's like saying, "If you toss a coin and get a head the first time, that won't affect the chance of getting a head the second time." So if you toss a coin twice, the chance of getting a head twice is a half -- that's the chance the first time -- times a half -- the chance a second time. So he said, "Here, I'll assume that these events are independent. When you multiply 8,500 by itself, you get about 73 million." And none of this was stated to the court as an assumption or presented to the jury that way. Unfortunately here -- and, really, regrettably -- first of all, in a situation like this you'd have to verify it empirically. And secondly, it's palpably false. There are lots and lots of things that we don't know about sudden infant deaths. It might well be that there are environmental factors that we're not aware of, and it's pretty likely to be the case that there are genetic factors we're not aware of. So if a family suffers from one cot death, you'd put them in a high-risk group. They've probably got these environmental risk factors and/or genetic risk factors we don't know about. And to argue, then, that the chance of a second death is as if you didn't know that information is really silly. It's worse than silly -- it's really bad science. Nonetheless, that's how it was presented, and at trial nobody even argued it. That's the first problem. The second problem is, what does the number of one in 73 million mean? 
So after Sally Clark was convicted -- you can imagine, it made rather a splash in the press -- one of the journalists from one of Britain's more reputable newspapers wrote that what the expert had said was, "The chance that she was innocent was one in 73 million." Now, that's a logical error. It's exactly the same logical error as the logical error of thinking that after the disease test, which is 99 percent accurate, the chance of having the disease is 99 percent. In the disease example, we had to bear in mind two things, one of which was the possibility that the test got it right or not. And the other one was the chance, a priori, that the person had the disease or not. It's exactly the same in this context. There are two things involved -- two parts to the explanation. We want to know how likely, or relatively how likely, two different explanations are. One of them is that Sally Clark was innocent -- which is, a priori, overwhelmingly likely -- most mothers don't kill their children. And the second part of the explanation is that she suffered an incredibly unlikely event. Not as unlikely as one in 73 million, but nonetheless rather unlikely. The other explanation is that she was guilty. Now, we probably think a priori that's unlikely. And we certainly should think in the context of a criminal trial that that's unlikely, because of the presumption of innocence. And then if she were trying to kill the children, she succeeded. So the chance that she's innocent isn't one in 73 million. We don't know what it is. It has to do with weighing up the strength of the other evidence against her and the statistical evidence. We know the children died. What matters is how likely or unlikely, relative to each other, the two explanations are. And they're both implausible. There's a situation where errors in statistics had really profound and really unfortunate consequences. 
In fact, there are two other women who were convicted on the basis of the evidence of this pediatrician, who have subsequently been released on appeal. Many cases were reviewed. And it's particularly topical because he's currently facing a disrepute charge at Britain's General Medical Council.
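The multiplication behind the expert's figure, and how sensitive it is to the independence assumption, can be sketched in Python. The one-in-8,500 figure is the rounded number quoted in the talk. The second-death risk of 1 in 100 is an invented placeholder, not a real estimate, included only to show how much the answer moves once the two deaths are allowed to be related.

```python
from fractions import Fraction


def chance_of_two_deaths(p_first: Fraction, p_second_given_first: Fraction) -> Fraction:
    """P(two cot deaths) = P(first) * P(second, given that a first occurred)."""
    return p_first * p_second_given_first


P_FIRST = Fraction(1, 8_500)  # rounded figure quoted in the talk

# Independence assumption: the second risk is treated as equal to the first.
independent = chance_of_two_deaths(P_FIRST, P_FIRST)

# HYPOTHETICAL value, chosen purely for illustration: a family that has had
# one cot death is assumed to be at higher risk of a second.
HYPOTHETICAL_SECOND_RISK = Fraction(1, 100)
dependent = chance_of_two_deaths(P_FIRST, HYPOTHETICAL_SECOND_RISK)

if __name__ == "__main__":
    print(independent)              # 1/72250000
    print(dependent)                # 1/850000
    print(dependent / independent)  # 85
```

Squaring the rounded 8,500 gives roughly 72 million, in line with the "about 73 million" quoted in court. Under the made-up dependent risk the same calculation is 85 times less extreme. Neither number says how likely it is that the mother is innocent, which is the second error the talk describes.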

So just to conclude -- what are the take-home messages from this? Well, we know that randomness and uncertainty and chance are very much a part of our everyday life. It's also true -- and, although, you, as a collective, are very special in many ways, you're completely typical in not getting the examples I gave right. It's very well documented that people get things wrong. They make errors of logic in reasoning with uncertainty. We can cope with the subtleties of language brilliantly -- and there are interesting evolutionary questions about how we got here. We are not good at reasoning with uncertainty. That's an issue in our everyday lives. As you've heard from many of the talks, statistics underpins an enormous amount of research in science -- in social science, in medicine and indeed, quite a lot of industry. All of quality control, which has had a major impact on industrial processing, is underpinned by statistics. It's something we're bad at doing. At the very least, we should recognize that, and we tend not to. To go back to the legal context, at the Sally Clark trial all of the lawyers just accepted what the expert said. So if a pediatrician had come out and said to a jury, "I know how to build bridges. I've built one down the road. Please drive your car home over it," they would have said, "Well, pediatricians don't know how to build bridges. That's what engineers do." On the other hand, he came out and effectively said, or implied, "I know how to reason with uncertainty. I know how to do statistics." And everyone said, "Well, that's fine. He's an expert." So we need to understand where our competence is and isn't. Exactly the same kinds of issues arose in the early days of DNA profiling, when scientists, and lawyers and in some cases judges, routinely misrepresented evidence. Usually -- one hopes -- innocently, but misrepresented evidence. Forensic scientists said, "The chance that this guy's innocent is one in three million." 
Even if you believe the number, just like the 73 million to one, that's not what it meant. And there have been celebrated appeal cases in Britain and elsewhere because of that.

And just to finish in the context of the legal system. It's all very well to say, "Let's do our best to present the evidence." But more and more, in cases of DNA profiling -- this is another one -- we expect juries, who are ordinary people -- and it's documented they're very bad at this -- we expect juries to be able to cope with the sorts of reasoning that goes on. In other spheres of life, if people argued -- well, except possibly for politics -- but in other spheres of life, if people argued illogically, we'd say that's not a good thing. We sort of expect it of politicians and don't hope for much more. In the case of uncertainty, we get it wrong all the time -- and at the very least, we should be aware of that, and ideally, we might try and do something about it. Thanks very much.
