Peter Donnelly: How juries are fooled by statistics

As other speakers have said, it's a rather daunting experience -- a particularly daunting experience -- to be speaking in front of this audience. But unlike the other speakers, I'm not going to tell you about the mysteries of the universe, or the wonders of evolution, or the really clever, innovative ways people are attacking the major inequalities in our world. Or even the challenges of nation-states in the modern global economy. My brief, as you've just heard, is to tell you about statistics -- and, to be more precise, to tell you some exciting things about statistics. And that's -- (Laughter) -- that's rather more challenging than all the speakers before me and all the ones coming after me. (Laughter) One of my senior colleagues told me, when I was a youngster in this profession, rather proudly, that statisticians were people who liked figures but didn't have the personality skills to become accountants. (Laughter) And there's another in-joke among statisticians, and that's, "How do you tell the introverted statistician from the extroverted statistician?" To which the answer is, "The extroverted statistician's the one who looks at the other person's shoes." (Laughter) But I want to tell you something useful -- and here it is, so concentrate now. This evening, there's a reception in the University's Museum of Natural History. And it's a wonderful setting, as I hope you'll find, and a great icon to the best of the Victorian tradition. It's very unlikely -- in this special setting, and this collection of people -- but you might just find yourself talking to someone you'd rather wish that you weren't. So here's what you do. When they say to you, "What do you do?" -- you say, "I'm a statistician." (Laughter) Well, except they've been pre-warned now, and they'll know you're making it up. And then one of two things will happen. They'll either discover their long-lost cousin in the other corner of the room and run over and talk to them. 
Or they'll suddenly become parched and/or hungry -- and often both -- and sprint off for a drink and some food. And you'll be left in peace to talk to the person you really want to talk to.

It's one of the challenges in our profession to try and explain what we do. We're not top on people's lists for dinner party guests and conversations and so on. And it's something I've never really found a good way of doing. But my wife -- who was then my girlfriend -- managed it much better than I've ever been able to. Many years ago, when we first started going out, she was working for the BBC in Britain, and I was, at that stage, working in America. I was coming back to visit her. She told this to one of her colleagues, who said, "Well, what does your boyfriend do?" Sarah thought quite hard about the things I'd explained -- and she concentrated, in those days, on listening. (Laughter) Don't tell her I said that. And she was thinking about the work I did developing mathematical models for understanding evolution and modern genetics. So when her colleague said, "What does he do?" She paused and said, "He models things." (Laughter) Well, her colleague suddenly got much more interested than I had any right to expect and went on and said, "What does he model?" Well, Sarah thought a little bit more about my work and said, "Genes." (Laughter) "He models genes."

That is my first love, and that's what I'll tell you a little bit about. What I want to do more generally is to get you thinking about the place of uncertainty and randomness and chance in our world, and how we react to that, and how well we do or don't think about it. So you've had a pretty easy time up till now -- a few laughs, and all that kind of thing -- in the talks to date. You've got to think, and I'm going to ask you some questions. So here's the scene for the first question I'm going to ask you. Can you imagine tossing a coin successively? And for some reason -- which shall remain rather vague -- we're interested in a particular pattern. Here's one -- a head, followed by a tail, followed by a tail.

So suppose we toss a coin repeatedly. Then the pattern, head-tail-tail, that we've suddenly become fixated with happens here. And you can count: one, two, three, four, five, six, seven, eight, nine, 10 -- it happens after the 10th toss. So you might think there are more interesting things to do, but humor me for the moment. Imagine this half of the audience each get out coins, and they toss them until they first see the pattern head-tail-tail. The first time they do it, maybe it happens after the 10th toss, as here. The second time, maybe it's after the fourth toss. The next time, after the 15th toss. So you do that lots and lots of times, and you average those numbers. That's what I want this side to think about.

The other half of the audience doesn't like head-tail-tail -- they think, for deep cultural reasons, that's boring -- and they're much more interested in a different pattern -- head-tail-head. So, on this side, you get out your coins, and you toss and toss and toss. And you count the number of tosses until the pattern head-tail-head appears and you average them. OK? So on this side, you've got a number -- you've done it lots of times, so you get it accurately -- which is the average number of tosses until head-tail-tail. On this side, you've got a number -- the average number of tosses until head-tail-head.

So here's a deep mathematical fact -- if you've got two numbers, one of three things must be true. Either they're the same, or this one's bigger than that one, or that one's bigger than this one. So what's going on here? So you've all got to think about this, and you've all got to vote -- and we're not moving on until everyone's expressed a view. And I don't want to end up in a two-minute silence to give you more time to think about it. OK. So what you want to do is compare the average number of tosses until we first see head-tail-head with the average number of tosses until we first see head-tail-tail.

Who thinks that A is true -- that, on average, it'll take longer to see head-tail-head than head-tail-tail? Who thinks that B is true -- that on average, they're the same? Who thinks that C is true -- that, on average, it'll take less time to see head-tail-head than head-tail-tail? OK, who hasn't voted yet? Because that's really naughty -- I said you had to. (Laughter) OK. So most people think B is true. And you might be relieved to know even rather distinguished mathematicians think that. It's not. A is true here. It takes longer, on average. In fact, the average number of tosses till head-tail-head is 10 and the average number of tosses until head-tail-tail is eight. How could that be? Anything different about the two patterns? There is. Head-tail-head overlaps itself. If you went head-tail-head-tail-head, you can cunningly get two occurrences of the pattern in only five tosses. You can't do that with head-tail-tail. That turns out to be important.
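
The 10-versus-eight claim can be checked directly. The sketch below is not from the talk: it assumes a fair coin, and the function names are made up for illustration. `expected_wait` uses the standard overlap rule (add 2**k for every k where the first k symbols of the pattern match its last k symbols), and `simulated_wait` is a plain Monte Carlo cross-check.

```python
import random


def expected_wait(pattern: str) -> int:
    """Exact mean number of fair-coin tosses until `pattern` first appears.

    Overlap rule: add 2**k for every k where the first k symbols of the
    pattern are the same as its last k symbols.
    """
    n = len(pattern)
    return sum(2 ** k for k in range(1, n + 1) if pattern[:k] == pattern[-k:])


def simulated_wait(pattern: str, trials: int = 20_000, seed: int = 0) -> float:
    """Average number of tosses until `pattern` appears, over many trials."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        recent = ""  # the most recent len(pattern) tosses
        tosses = 0
        while recent != pattern:
            recent = (recent + rng.choice("HT"))[-len(pattern):]
            tosses += 1
        total += tosses
    return total / trials


print(expected_wait("HTH"))  # 10
print(expected_wait("HTT"))  # 8
```

Head-tail-head matches itself at k = 1 (it starts and ends with a head) as well as at k = 3, which is where the extra two tosses come from; head-tail-tail only matches at k = 3.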

There are two ways of thinking about this. I'll give you one of them. So imagine -- let's suppose we're doing it. Remember, this side is excited about head-tail-tail, and that side is excited about head-tail-head. Take the head-tail-head side first. We start tossing a coin, and we get a head -- and you start sitting on the edge of your seat because something great and wonderful, or awesome, might be about to happen. The next toss is a tail -- you get really excited. The champagne's on ice just next to you; you've got the glasses chilled to celebrate. You're waiting with bated breath for the final toss. And if it comes down a head, that's great. You're done, and you celebrate. If it's a tail -- well, rather disappointedly, you put the glasses away and put the champagne back. And you keep tossing, to wait for the next head, to get excited.

On the head-tail-tail side, there's a different experience. It's the same for the first two parts of the sequence. You're a little bit excited with the first head -- you get rather more excited with the next tail. Then you toss the coin. If it's a tail, you crack open the champagne. If it's a head you're disappointed, but you're still a third of the way to your pattern again. And that's an informal way of presenting it -- that's why there's a difference. Another way of thinking about it -- if we tossed a coin eight million times, then we'd expect a million head-tail-heads and a million head-tail-tails -- but the head-tail-heads could occur in clumps. So if you want to put a million things down amongst eight million positions and you can have some of them overlapping, the clumps will be further apart. It's another way of getting the intuition.
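
The clumping argument can be illustrated with a short simulation. This is a sketch under assumptions, not something from the talk: it uses 800,000 simulated fair tosses rather than eight million to keep the run short, and the helper names are invented. Both patterns should turn up in roughly one position in eight, but only head-tail-head can overlap its own previous occurrence.

```python
import random


def occurrence_starts(seq: str, pattern: str) -> list[int]:
    """Start index of every occurrence of `pattern`, overlapping ones included."""
    n = len(pattern)
    return [i for i in range(len(seq) - n + 1) if seq[i:i + n] == pattern]


def overlapping_pairs(starts: list[int], length: int) -> int:
    """Count consecutive occurrences that share at least one toss."""
    return sum(1 for a, b in zip(starts, starts[1:]) if b - a < length)


rng = random.Random(1)
tosses = "".join(rng.choice("HT") for _ in range(800_000))

hth = occurrence_starts(tosses, "HTH")
htt = occurrence_starts(tosses, "HTT")

print("HTH occurrences:", len(hth), "overlapping pairs:", overlapping_pairs(hth, 3))
print("HTT occurrences:", len(htt), "overlapping pairs:", overlapping_pairs(htt, 3))
```

Because the two patterns occur about equally often overall, the occurrences that bunch together for head-tail-head have to be paid for with longer gaps between bunches.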

What's the point I want to make? It's a very, very simple example, an easily stated question in probability, which every -- you're in good company -- everybody gets wrong. This is my little diversion into my real passion, which is genetics. There's a connection between head-tail-heads and head-tail-tails in genetics, and it's the following. When you toss a coin, you get a sequence of heads and tails. When you look at DNA, there's a sequence of not two things -- heads and tails -- but four letters -- As, Gs, Cs and Ts. And there are little chemical scissors, called restriction enzymes which cut DNA whenever they see particular patterns. And they're an enormously useful tool in modern molecular biology. And instead of asking the question, "How long until I see a head-tail-head?" -- you can ask, "How big will the chunks be when I use a restriction enzyme which cuts whenever it sees G-A-A-G, for example? How long will those chunks be?"
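
The chunk-size question can be sketched the same way. Everything here is a simplifying assumption rather than real biology: the DNA is modeled as independent, equally likely letters, the "enzyme" cuts at the start of every G-A-A-G, and the function names are hypothetical.

```python
import random


def cut_sites(dna: str, site: str) -> list[int]:
    """Start index of every occurrence of `site`, overlapping ones included."""
    positions = []
    pos = dna.find(site)
    while pos != -1:
        positions.append(pos)
        pos = dna.find(site, pos + 1)
    return positions


def fragment_lengths(dna: str, site: str) -> list[int]:
    """Lengths of the pieces left after cutting at the start of every site."""
    edges = [0] + cut_sites(dna, site) + [len(dna)]
    return [b - a for a, b in zip(edges, edges[1:]) if b > a]


rng = random.Random(2)
dna = "".join(rng.choice("ACGT") for _ in range(500_000))  # toy random sequence

lengths = fragment_lengths(dna, "GAAG")
mean_length = sum(lengths) / len(lengths)
print(f"{len(lengths)} fragments, mean length {mean_length:.0f} bases")
```

With four equally likely letters, a given four-letter site starts at about one position in 4**4 = 256, so the mean chunk comes out near 256 bases in this toy model. G-A-A-G begins and ends with the same letter, so like head-tail-head it can overlap itself.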

That's a rather trivial connection between probability and genetics. There's a much deeper connection, which I don't have time to go into, and that is that modern genetics is a really exciting area of science. And we'll hear some talks later in the conference specifically about that. But it turns out that unlocking the secrets in the information generated by modern experimental technologies, a key part of that has to do with fairly sophisticated -- you'll be relieved to know that I do something useful in my day job, rather more sophisticated than the head-tail-head story -- but quite sophisticated computer modeling and mathematical modeling and modern statistical techniques. And I will give you two little snippets -- two examples -- of projects we're involved in in my group in Oxford, both of which I think are rather exciting. You know about the Human Genome Project. That was a project which aimed to read one copy of the human genome. The natural thing to do after you've done that is what this project does -- the International HapMap Project, which is a collaboration between labs in five or six different countries. Think of the Human Genome Project as learning what we've got in common, and the HapMap Project is trying to understand where there are differences between different people.

Why do we care about that? Well, there are lots of reasons. The most pressing one is that we want to understand how some differences make some people susceptible to one disease -- type-2 diabetes, for example -- and other differences make people more susceptible to heart disease, or stroke, or autism and so on. That's one big project. There's a second big project, recently funded by the Wellcome Trust in this country, involving very large studies -- thousands of individuals, with each of eight different diseases, common diseases like type-1 and type-2 diabetes, and coronary heart disease, bipolar disease and so on -- to try and understand the genetics. To try and understand what it is about genetic differences that causes the diseases. Why do we want to do that? Because we understand very little about most human diseases. We don't know what causes them. And if we can get in at the bottom and understand the genetics, we'll have a window on the way the disease works, and a whole new way of thinking about disease therapies and preventative treatment and so on. So that's, as I said, the little diversion on my main love.

Back to some of the more mundane issues of thinking about uncertainty. Here's another quiz for you -- now suppose we've got a test for a disease which isn't infallible, but it's pretty good. It gets it right 99 percent of the time. And I take one of you, or I take someone off the street, and I test them for the disease in question. Let's suppose there's a test for HIV -- the virus that causes AIDS -- and the test says the person has the disease. What's the chance that they do? The test gets it right 99 percent of the time. So a natural answer is 99 percent. Who likes that answer? Come on -- everyone's got to get involved. Don't think you don't trust me anymore. (Laughter) Well, you're right to be a bit skeptical, because that's not the answer. That's what you might think. It's not the answer, and that's because it's only part of the story. It actually depends on how common or how rare the disease is. So let me try and illustrate that. Here's a little caricature of a million individuals. So let's think about a disease that affects -- it's pretty rare, it affects one person in 10,000. Amongst these million individuals, most of them are healthy and some of them will have the disease. And in fact, if this is the prevalence of the disease, about 100 will have the disease and the rest won't. So now suppose we test them all. What happens? Well, amongst the 100 who do have the disease, the test will get it right 99 percent of the time, and 99 will test positive. Amongst all these other people who don't have the disease, the test will get it right 99 percent of the time. It'll only get it wrong one percent of the time. But there are so many of them that there'll be an enormous number of false positives. Put that another way -- of all of them who test positive -- so here they are, the individuals involved -- less than one in 100 actually have the disease. So even though we think the test is accurate, the important part of the story is there's another bit of information we need.
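
The head count above can be reproduced in a few lines. This is a sketch using the talk's figures (a million people, a prevalence of one in 10,000, a test that is right 99 percent of the time); it assumes the 99 percent applies equally to people with and without the disease, and the function name is made up for illustration.

```python
from fractions import Fraction


def screen(population: int, prevalence: Fraction, accuracy: Fraction):
    """Return (true positives, false positives, P(disease | positive test))."""
    sick = population * prevalence
    healthy = population - sick
    true_pos = sick * accuracy            # sick people correctly flagged
    false_pos = healthy * (1 - accuracy)  # healthy people wrongly flagged
    return true_pos, false_pos, true_pos / (true_pos + false_pos)


true_pos, false_pos, ppv = screen(
    population=1_000_000,
    prevalence=Fraction(1, 10_000),
    accuracy=Fraction(99, 100),
)
print(true_pos, false_pos, ppv)  # 99 9999 1/102
```

So of the 10,098 people who test positive, 99 have the disease: one in 102, which is the "less than one in 100" in the talk.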

Here's the key intuition. What we have to do, once we know the test is positive, is to weigh up the plausibility, or the likelihood, of two competing explanations. Each of those explanations has a likely bit and an unlikely bit. One explanation is that the person doesn't have the disease -- that's overwhelmingly likely, if you pick someone at random -- but the test gets it wrong, which is unlikely. The other explanation is that the person does have the disease -- that's unlikely -- but the test gets it right, which is likely. And the number we end up with -- that number which is a little bit less than one in 100 -- is to do with how likely one of those explanations is relative to the other. Each of them taken together is unlikely.
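
Weighing the two explanations against each other is Bayes' rule in odds form. The notation below is introduced here for illustration and is not from the talk: D means the person has the disease, D-bar means they don't, and + means a positive test. The numbers are the talk's (prevalence one in 10,000, test right 99 percent of the time).

```latex
\frac{P(D \mid +)}{P(\bar{D} \mid +)}
  = \frac{P(D)}{P(\bar{D})} \times \frac{P(+ \mid D)}{P(+ \mid \bar{D})}
  = \frac{1/10{,}000}{9{,}999/10{,}000} \times \frac{0.99}{0.01}
  = \frac{99}{9{,}999}
  = \frac{1}{101}
```

Odds of 1 to 101 correspond to a probability of 1/102, a little less than one in 100.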

Here's a more topical example of exactly the same thing. Those of you in Britain will know about what's become rather a celebrated case of a woman called Sally Clark, who had two babies who died suddenly. And initially, it was thought that they died of what's known informally as "cot death," and more formally as "Sudden Infant Death Syndrome." For various reasons, she was later charged with murder. And at the trial, her trial, a very distinguished pediatrician gave evidence that the chance of two cot deaths, innocent deaths, in a family like hers -- which was professional and non-smoking -- was one in 73 million. To cut a long story short, she was convicted at the time. Later, and fairly recently, acquitted on appeal -- in fact, on the second appeal. And just to set it in context, you can imagine how awful it is for someone to have lost one child, and then two, if they're innocent, to be convicted of murdering them. To be put through the stress of the trial, convicted of murdering them -- and to spend time in a women's prison, where all the other prisoners think you killed your children -- is a really awful thing to happen to someone. And it happened in large part here because the expert got the statistics horribly wrong, in two different ways.

So where did he get the one in 73 million number? He looked at some research, which said the chance of one cot death in a family like Sally Clark's is about one in 8,500. So he said, "I'll assume that if you have one cot death in a family, the chance of a second child dying from cot death isn't changed." So that's what statisticians would call an assumption of independence. It's like saying, "If you toss a coin and get a head the first time, that won't affect the chance of getting a head the second time." So if you toss a coin twice, the chance of getting a head twice is a half -- that's the chance the first time -- times a half -- the chance a second time. So he said, "Here, I'll assume that these events are independent. When you multiply 8,500 together twice, you get about 73 million." And none of this was stated to the court as an assumption or presented to the jury that way. Unfortunately here -- and, really, regrettably -- first of all, in a situation like this you'd have to verify it empirically. And secondly, it's palpably false. There are lots and lots of things that we don't know about sudden infant deaths. It might well be that there are environmental factors that we're not aware of, and it's pretty likely to be the case that there are genetic factors we're not aware of. So if a family suffers from one cot death, you'd put them in a high-risk group. They've probably got these environmental risk factors and/or genetic risk factors we don't know about. And to argue, then, that the chance of a second death is as if you didn't know that information is really silly. It's worse than silly -- it's really bad science. Nonetheless, that's how it was presented, and at trial nobody even argued it. That's the first problem. The second problem is, what does the number of one in 73 million mean?
So after Sally Clark was convicted -- you can imagine, it made rather a splash in the press -- one of the journalists from one of Britain's more reputable newspapers wrote that what the expert had said was, "The chance that she was innocent was one in 73 million." Now, that's a logical error. It's exactly the same logical error as the logical error of thinking that after the disease test, which is 99 percent accurate, the chance of having the disease is 99 percent. In the disease example, we had to bear in mind two things, one of which was the possibility that the test got it right or not. And the other one was the chance, a priori, that the person had the disease or not. It's exactly the same in this context. There are two things involved -- two parts to the explanation. We want to know how likely, or relatively how likely, two different explanations are. One of them is that Sally Clark was innocent -- which is, a priori, overwhelmingly likely -- most mothers don't kill their children. And the second part of the explanation is that she suffered an incredibly unlikely event. Not as unlikely as one in 73 million, but nonetheless rather unlikely. The other explanation is that she was guilty. Now, we probably think a priori that's unlikely. And we certainly should think in the context of a criminal trial that that's unlikely, because of the presumption of innocence. And then if she were trying to kill the children, she succeeded. So the chance that she's innocent isn't one in 73 million. We don't know what it is. It has to do with weighing up the strength of the other evidence against her and the statistical evidence. We know the children died. What matters is how likely or unlikely, relative to each other, the two explanations are. And they're both implausible. There's a situation where errors in statistics had really profound and really unfortunate consequences. 
In fact, there are two other women who were convicted on the basis of the evidence of this pediatrician, who have subsequently been released on appeal. Many cases were reviewed. And it's particularly topical because he's currently facing a disrepute charge at Britain's General Medical Council.
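
The first error, the independence assumption, is easy to see in arithmetic. The sketch below uses the one-in-8,500 figure quoted in the talk; the `risk_multiplier` values are purely hypothetical and are not estimates of the real risk in any family. They only show how strongly the headline number depends on the assumption.

```python
from fractions import Fraction

P_ONE_COT_DEATH = Fraction(1, 8500)  # figure quoted in the talk


def two_deaths_if_independent(p_one: Fraction) -> Fraction:
    """Chance of two cot deaths if the first tells you nothing about the second."""
    return p_one * p_one


def two_deaths_with_dependence(p_one: Fraction, risk_multiplier: int) -> Fraction:
    """Chance of two cot deaths if a first death raises the risk of a second.

    `risk_multiplier` is a HYPOTHETICAL factor chosen for illustration only.
    """
    p_second_given_first = min(Fraction(1), p_one * risk_multiplier)
    return p_one * p_second_given_first


independent = two_deaths_if_independent(P_ONE_COT_DEATH)
print(f"independent: 1 in {independent.denominator:,}")  # 1 in 72,250,000

for multiplier in (5, 10, 50):  # hypothetical values, not real estimates
    joint = two_deaths_with_dependence(P_ONE_COT_DEATH, multiplier)
    print(f"risk x{multiplier}: 1 in {float(1 / joint):,.0f}")
```

Squaring 8,500 gives 72,250,000, which is the "about 73 million" in the testimony. Even if the corrected figure were known, it would still be the chance of the deaths given innocence, not the chance of innocence given the deaths, which is the second error described above.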

So just to conclude -- what are the take-home messages from this? Well, we know that randomness and uncertainty and chance are very much a part of our everyday life. It's also true -- and, although, you, as a collective, are very special in many ways, you're completely typical in not getting the examples I gave right. It's very well documented that people get things wrong. They make errors of logic in reasoning with uncertainty. We can cope with the subtleties of language brilliantly -- and there are interesting evolutionary questions about how we got here. We are not good at reasoning with uncertainty. That's an issue in our everyday lives. As you've heard from many of the talks, statistics underpins an enormous amount of research in science -- in social science, in medicine and indeed, quite a lot of industry. All of quality control, which has had a major impact on industrial processing, is underpinned by statistics. It's something we're bad at doing. At the very least, we should recognize that, and we tend not to. To go back to the legal context, at the Sally Clark trial all of the lawyers just accepted what the expert said. So if a pediatrician had come out and said to a jury, "I know how to build bridges. I've built one down the road. Please drive your car home over it," they would have said, "Well, pediatricians don't know how to build bridges. That's what engineers do." On the other hand, he came out and effectively said, or implied, "I know how to reason with uncertainty. I know how to do statistics." And everyone said, "Well, that's fine. He's an expert." So we need to understand where our competence is and isn't. Exactly the same kinds of issues arose in the early days of DNA profiling, when scientists, and lawyers and in some cases judges, routinely misrepresented evidence. Usually -- one hopes -- innocently, but misrepresented evidence. Forensic scientists said, "The chance that this guy's innocent is one in three million." 
Even if you believe the number, just like the 73 million to one, that's not what it meant. And there have been celebrated appeal cases in Britain and elsewhere because of that.

And just to finish in the context of the legal system. It's all very well to say, "Let's do our best to present the evidence." But more and more, in cases of DNA profiling -- this is another one -- we expect juries, who are ordinary people -- and it's documented they're very bad at this -- we expect juries to be able to cope with the sorts of reasoning that goes on. In other spheres of life, if people argued -- well, except possibly for politics -- but in other spheres of life, if people argued illogically, we'd say that's not a good thing. We sort of expect it of politicians and don't hope for much more. In the case of uncertainty, we get it wrong all the time -- and at the very least, we should be aware of that, and ideally, we might try and do something about it. Thanks very much.
