Nicholas Christakis: How social networks predict epidemics

For the last 10 years, I've been spending my time trying to figure out how and why human beings assemble themselves into social networks. And the kind of social network I'm talking about is not the recent online variety, but rather, the kind of social networks that human beings have been assembling for hundreds of thousands of years, ever since we emerged from the African savannah. So, I form friendships and co-worker and sibling and relative relationships with other people who in turn have similar relationships with other people. And this spreads on out endlessly into a distance. And you get a network that looks like this. Every dot is a person. Every line between them is a relationship between two people -- different kinds of relationships. And you can get this kind of vast fabric of humanity, in which we're all embedded.

در ۱۰ سال گذشته ، من وقتم را صرف پیدا کردن اینکه چطور و چرا انسانها از خودشان شبکه‌های اجتماعی می‌‌سازند. شبکه اجتماعی که من در مورد آن صحبت میکنم از این شبکه‌های اجتماعی که جدیدا در اینترنت پیدا شده نیست بلکه، نوعی از شبکه اجتماعی است که انسانها از صدها هزار سال پیش در اطراف خود درست میکردند از زمانی‌ که ما از قارهٔ آفریقا جدا شدیم. بنابراین من روابط بین دوستان ، همکاران هم نژاد ها و خویشاوندان با سایر افراد را ترسیم میکنم. که هر یک از این افراد روابط مشابه ای با دیگران دارند و این روابط به صورت نامحدود در مسیری گسترش پیدا می کنند و در نهایت شما شبکه ای نظیر این را میبینید هر نقطه‌ یک فرد هست و هر خط بین آن ها نمایانگر روابط بین دو فرد است انواع روابط مختلف و در نهایت جامعه وسیعی از انسان ها به این صورت خواهیم داشت که همه ما در آن حضور داریم.

And my colleague, James Fowler, and I have been studying for quite some time what are the mathematical, social, biological and psychological rules that govern how these networks are assembled and what are the similar rules that govern how they operate, how they affect our lives. But recently, we've been wondering whether it might be possible to take advantage of this insight, to actually find ways to improve the world, to do something better, to actually fix things, not just understand things. So one of the first things we thought we would tackle would be how we go about predicting epidemics.

من و همکارم، جیمز فلر برای مدتی است که در این رابطه مطالعه میکنیم تا ببینیم قوانین محاسباتی، اجتماعی بیولوژی و روانشناختی حاکم بر این شبکه ها چه گونه شکل گرفته و قوانین مشترک برای شکل گیری این شبکه ها چیست و چگونه بر زندگی ما تاثیر میگذارند اما اخیرا به این فکر افتادیم که ببینیم چطور می شود از این اطلاعات طور دیگری استفاده کرد، و راهی برای بهتر شدن دنیا پیدا کرد. کارهای بهتر انجام داد تا در عمل بتوان مسائل را حل کرد، نه اینکه فقط مسائل را درک کرد بنابراین اولین چیزی که فکر کردیم میتوان از عهده اش بر آمد مسئله چگونگی پیش بینی بیماری های همه گیر بود

And the current state of the art in predicting an epidemic -- if you're the CDC or some other national body -- is to sit in the middle where you are and collect data from physicians and laboratories in the field that report the prevalence or the incidence of certain conditions. So, so and so patients have been diagnosed with something, or other patients have been diagnosed, and all these data are fed into a central repository, with some delay. And if everything goes smoothly, one to two weeks from now you'll know where the epidemic was today. And actually, about a year or so ago, there was this promulgation of the idea of Google Flu Trends, with respect to the flu, where by looking at people's searching behavior today, we could know where the flu -- what the status of the epidemic was today, what's the prevalence of the epidemic today.

در حال حاضر وضعیت پیش بینی بیماری های همه گیر اگر عضو یک مرکز کنترل دارو و یا یکی از افراد ملی باشید به گونه ای است که باید منتظربمانید و اطلاعات جمع کنید از پزشکان و آزمایشگاه ها ی مرتبط که شیوع یا بروز مسئله خاصی را گزارش میکنند تعداد بسیار زیادی به بیماری خاصی تشخیص داده می شوند یا بیماران دیگری تشخیص داده می شوند و همه این اطلاعات با کمی تاخیر در یک مکان جمع آوری می شوند و اگر همه چیز خوب پیش برود یک تا دوهفته بعد تازه می فهمید که بیماری همه گیر از کجا شروع شده حدودا یک سال پیش بود که ایده ای رواج یافته بود در رابطه با {روند آنفولانزای گوگل}، که با توجه به آنفولانزا و نحوه جست و جوی مردم میتوانیم بفهمیم که آنفولانزا کجا بوده و وضعیت همه گیری امروز چگونه بوده است و شیوع این همه گیری به چه شکل بوده

But what I'd like to show you today is a means by which we might get not just rapid warning about an epidemic, but also actually early detection of an epidemic. And, in fact, this idea can be used not just to predict epidemics of germs, but also to predict epidemics of all sorts of kinds. For example, anything that spreads by a form of social contagion could be understood in this way, from abstract ideas on the left like patriotism, or altruism, or religion to practices like dieting behavior, or book purchasing, or drinking, or bicycle-helmet [and] other safety practices, or products that people might buy, purchases of electronic goods, anything in which there's kind of an interpersonal spread. A kind of a diffusion of innovation could be understood and predicted by the mechanism I'm going to show you now.

اما چیزی که من امروز قرار است به شما نشان دهم ابزاری است که از طریق آن نه تنها خیلی سریع در رابطه با یک بیماری همه گیر هشداردریافت میکنیم بلکه در حقیقت خیلی زود میتوان این بیماری همه گیر را شناسایی کرد و در واقع این ایده فقط برای پیش بینی همه گیری میکروبها مناسب نیست بلکه برای انواع همه گیری میتوان از آن استفاده کرد برای مثال هر چیزی که از طریق اجتماع شیوع پیدا می کند میتواند از این طریق شناخته شود از موضوعات انتزاعی مانند وطن دوستی، نوع دوستی، یا مذهب گرفته تا کارهایی مثل رفتار غذایی، خرید کتاب نوشیدن، کلاه ایمنی دوچرخه سواری و سایر کارهای ایمنی یا محصولاتی که مردم ممکن است بخرند خرید لوازم الکترونیکی یا هرچیزی که در آن نوعی گسترش بین فردی وجود دارد نوعی انتشار نوع آوری که همه این ها با استفاده از مکانیزمی که توضیح خواهم داد قابل درک و قابل پیش بینی هستند

So, as all of you probably know, the classic way of thinking about this is the diffusion-of-innovation, or the adoption curve. So here on the Y-axis, we have the percent of the people affected, and on the X-axis, we have time. And at the very beginning, not too many people are affected, and you get this classic sigmoidal, or S-shaped, curve. And the reason for this shape is that at the very beginning, let's say one or two people are infected, or affected by the thing and then they affect, or infect, two people, who in turn affect four, eight, 16 and so forth, and you get the epidemic growth phase of the curve. And eventually, you saturate the population. There are fewer and fewer people who are still available that you might infect, and then you get the plateau of the curve, and you get this classic sigmoidal curve. And this holds for germs, ideas, product adoption, behaviors, and the like. But things don't just diffuse in human populations at random. They actually diffuse through networks. Because, as I said, we live our lives in networks, and these networks have a particular kind of a structure.

پس از آنجایی که احتمالا همه شما میدانید شیوه کلاسیک فکر در رابطه با این موضوع را انتشار نوآوری یا منحنی تصویب می نامند در محور عمودی افرادی که مبتلا به بیماری شده اند نمایش داده شده و در محور افقی زمان و در همان ابتدا افراد زیادی به بیماری آلوده نشده اند و در اینجا یک منحنی هلالی شکل شبیه به حرف S داریم و دلیل به وجود آمدن این شکل این است که در ابتدا مثلا زمانی که یک یا دو نفر آلوده یا مبتلا به بیماری شده اند دو نفر دیگر را بیمار و مبتلا می کنند و در عوض این دو نفر چهار، هشت، شانزده و همین طور افراد دیگر را مبتلا میسازند و ما به مرحله رشد همه گیری منحنی خواهیم رسید و در نهایت یک جمعیت آلوده به بیماری می شود. با گذر زمان تعداد افرادی که قابل مبتلا کردن هستند کم تر و کم تر میشود تا به پایان منحنی نزدیک می شویم و به این منحنی هلالی شکل کلاسیک می رسیم و این منحنی برای میکروبها، ایده ها پذیرش محصول، رفتار و چیزهایی شبیه به این هم صادق است. اما این چیزها به صورت تصادفی بین جمعیت ها انتشار پیدا نمیکنند آن ها از طریق شبکه ها انتشار پیدا می کنند زیرا همان طور که گفتم ما درون شبکه ها زندگی میکنیم و این شبکه ها ویژگی های خاصی دارند
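The S-shaped curve described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming everyone mixes uniformly (no network structure yet); the function name `adoption_curve` and all parameter values are hypothetical and chosen only to make the shape visible.

```python
def adoption_curve(population, initial, contacts_per_step, steps):
    """Return the cumulative number of people affected after each time step."""
    affected = float(initial)
    curve = [affected]
    for _ in range(steps):
        # Only the still-unaffected share of the population can be newly affected,
        # so growth slows down as the population saturates.
        unaffected_share = 1.0 - affected / population
        affected += contacts_per_step * affected * unaffected_share
        curve.append(affected)
    return curve


# Hypothetical numbers, not taken from any real outbreak.
curve = adoption_curve(population=1000, initial=1, contacts_per_step=1.0, steps=20)

# New cases per step: the counterpart of a "daily incidence" view of the same data.
new_cases = [later - earlier for earlier, later in zip(curve, curve[1:])]

for step, total in enumerate(curve):
    print(f"t={step:2d}  cumulative={total:7.1f}")
```

Early on the count roughly doubles each step (one, two, four, eight), and once most of the population is affected the curve flattens into the plateau.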

Now if you look at a network like this -- this is 105 people. And the lines represent -- the dots are the people, and the lines represent friendship relationships. You might see that people occupy different locations within the network. And there are different kinds of relationships between the people. You could have friendship relationships, sibling relationships, spousal relationships, co-worker relationships, neighbor relationships and the like. And different sorts of things spread across different sorts of ties. For instance, sexually transmitted diseases will spread across sexual ties. Or, for instance, people's smoking behavior might be influenced by their friends. Or their altruistic or their charitable giving behavior might be influenced by their coworkers, or by their neighbors. But not all positions in the network are the same.

حالا اگر به شبکه ای مثل این نگاه کنید ۱۰۵ نفر میبینید و یک سری خطوط... نقطه ها افراد هستند و خطوط نمایانگر روابط دوستانه ممکن است ببینید بعضی از افراد نقاط مختلفی از شبکه را اشغال کردند روابط مختلفی بین افراد وجود دارد در اینجا روابط دوستی، خویشاوندی روابط زناشویی، روابط همکاری روابط همسایه ای و شبیه به آن وجود دارد و چیزهای مختلفی که از طریق گره های مختلفی گسترش می یابد به عنوان مثال، بیماریهایی که از طریق روابط جنسی منتقل می شوند از طریق گره های روابط جنسی منتقل خواهند شد یا مثلا رفتار سیگار کشیدن افراد ممکن است از دوستانشان تاثیر گرفته باشد و یا رفتار هم نوع دوستی و کمک به خیریه از همکارانشان تاثیر گرفته باشد یا همسایه هایشان اما همه جایگاه های شبکه مثل هم نیستند

So if you look at this, you might immediately grasp that different people have different numbers of connections. Some people have one connection, some have two, some have six, some have 10 connections. And this is called the "degree" of a node, or the number of connections that a node has. But in addition, there's something else. So, if you look at nodes A and B, they both have six connections. But if you can see this image [of the network] from a bird's eye view, you can appreciate that there's something very different about nodes A and B. So, let me ask you this -- I can cultivate this intuition by asking a question -- who would you rather be if a deadly germ was spreading through the network, A or B? (Audience: B.) Nicholas Christakis: B, it's obvious. B is located on the edge of the network. Now, who would you rather be if a juicy piece of gossip were spreading through the network? A. And you have an immediate appreciation that A is going to be more likely to get the thing that's spreading and to get it sooner by virtue of their structural location within the network. A, in fact, is more central, and this can be formalized mathematically. So, if we want to track something that was spreading through a network, what we ideally would like to do is to set up sensors on the central individuals within the network, including node A, monitor those people that are right there in the middle of the network, and somehow get an early detection of whatever it is that is spreading through the network.

پس اگر به این نگاه کنید سریع متوجه می شوید که تعداد روابط افراد با یکدیگر متفاوت است. برخی افراد با یک نفر در ارتباط هستند برخی دیگر با دو نفر برخی با ۶ نفر و دیگری با ۱۰ نفر و این تعداد روابط را " درجه گره " یا " تعداد روابط گره " می گویند. اما نکته دیگری هم وجود دارد اگر به گره های A و B نگاه کنید هر دو با ۶ نفر در ارتباط هستند اما اگر به این عکس شبکه به طور کلی از بالا نگاه کنید احساس میکنید چیزی متفاوت در رابطه با نقطه A و B وجود دارد. پس بگذارید یک سوال بپرسم. با این سوال موضوع را بهتر میتوانم منتقل کنم. ترجیح می دهید کدام یک از این افراد باشید؟ اگر یک میکروب کشنده در حال گسترش در این شبکه بود؟ A یا B شنونده ها: B نیکولاس چریستکیس: بله B، کاملا واضح است. B در گوشه این شبکه قرار دارد. حالا ترجیح می دهید کدام یک از این افراد باشید؟ وقتی یک شایعه داغ در حال گسترش در شبکه است. A. شما نزدیک ترین فردی هستید که از او قدردانی می شود. احتمال شنیدن خبر توسط A بیش تر از بقیه است و خیلی زود متوجه خبر ها در شبکه می شود به علت جای ساختاری اش درون شبکه A در واقع فردی مرکزی است. و این می تواند فرمول ریاضی داشته باشد. بنابراین اگر بخواهیم رد چیزی را بگیریم که در شبکه در حال گسترش است ایده الش این است که یک سری گیرنده به افراد مرکزی شبکه متصل کنیم. همچنین گره A نظارت بر افرادی که در مرکز شبکه قرار دارند و تا حدودی ردیابی به موقع هر چیزی که درون شبکه در حال گسترش است.
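The difference between nodes A and B can be made concrete with a toy network. This is a minimal sketch, assuming closeness centrality as the formal measure (the talk does not say which centrality measure was used) and a made-up 12-person network in which A and B each have three connections rather than the six shown on the slide.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from a list of (person, person) ties."""
    graph = {}
    for left, right in edges:
        graph.setdefault(left, set()).add(right)
        graph.setdefault(right, set()).add(left)
    return graph


def degree(graph, node):
    """Number of connections a node has."""
    return len(graph[node])


def closeness(graph, node):
    """Closeness centrality: (n - 1) divided by the sum of shortest-path distances.

    Assumes the graph is connected.
    """
    distance = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search outward from `node`
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return (len(graph) - 1) / sum(distance.values())


# Hypothetical network: A sits in the middle, B sits near the edge.
edges = [
    ("A", "c1"), ("A", "c2"), ("A", "c3"),
    ("c1", "d1"), ("c1", "d2"),
    ("c2", "d3"), ("c2", "d4"),
    ("c3", "d5"), ("c3", "B"),
    ("B", "e1"), ("B", "e2"),
]
network = build_graph(edges)

for name in ("A", "B"):
    print(name, "degree:", degree(network, name),
          "closeness:", round(closeness(network, name), 3))
```

Both nodes have the same degree, but A is on average fewer steps away from everyone else, so anything spreading through the network tends to reach A sooner.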

So if you saw them contract a germ or a piece of information, you would know that, soon enough, everybody was about to contract this germ or this piece of information. And this would be much better than monitoring six randomly chosen people, without reference to the structure of the population. And in fact, if you could do that, what you would see is something like this. On the left-hand panel, again, we have the S-shaped curve of adoption. In the dotted red line, we show what the adoption would be in the random people, and in the left-hand line, shifted to the left, we show what the adoption would be in the central individuals within the network. On the Y-axis is the cumulative instances of contagion, and on the X-axis is the time. And on the right-hand side, we show the same data, but here with daily incidence. And what we show here is -- like, here -- very few people are affected, more and more and more and up to here, and here's the peak of the epidemic. But shifted to the left is what's occurring in the central individuals. And this difference in time between the two is the early detection, the early warning we can get, about an impending epidemic in the human population.

بنابراین اگر دیدید که این افراد میکروب یا اطلاعاتی را دریافت کردند، متوجه خواهید شد که به زودی همه افراد میکروب یا اطلاعات را دریافت خواهند کرد و مسلما این کار خیلی بهتر از این است که روی ۶ نفر به صورت تصادفی و بدون رجوع به ساختار جمعیت نظارت داشته باشیم. نظارت داشته باشیم چیزی که خواهید دید شبیه به این خواهد بود در پنل سمت چپ دوباره منحنی جذبی شبیه به حرف S داریم خط قرمز نقطه نقطه نشان می دهد که جذب در انتخاب تصادفی افراد چگونه شکل می گیرد و خط سمت راست که به سمت چپ حرکت کرده نشان می دهد که جذب در مرکز افراد داخل شبکه چگونه صورت میگیرد روی محور عمودی تجمع افرادی است که مورد سرایت قرار گرفته اند و محور افقی زمان است و در سمت راست اطلاعات مشابه ای میبینید اما در اینجا همراه شیوع روزانه و چیزی که اینجا میبینید مثل اینجاست تعداد کمی مبتلا شده اند و بعد زیاد و زیادتر می شوند تا به اینجا می رسند و اینجا نقطه اوج همه گیری است اما سمت چپ اتفاقی است که برای افراد مرکزی می افتد و این تفاوت زمان بین این دو شناسایی و هشدار سریع تر را در رابطه با همه گیری احتمالی در یک جمعیت انسانی را برای ما مهیا می سازد.

The problem, however, is that mapping human social networks is not always possible. It can be expensive, not feasible, unethical, or, frankly, just not possible to do such a thing. So, how can we figure out who the central people are in a network without actually mapping the network? What we came up with was an idea to exploit an old fact, or a known fact, about social networks, which goes like this: Do you know that your friends have more friends than you do? Your friends have more friends than you do, and this is known as the friendship paradox. Imagine a very popular person in the social network -- like a party host who has hundreds of friends -- and a misanthrope who has just one friend, and you pick someone at random from the population; they are much more likely to know the party host. And if they nominate the party host as their friend, that party host has a hundred friends, therefore, has more friends than they do. And this, in essence, is what's known as the friendship paradox. The friends of randomly chosen people have higher degree, and are more central than the random people themselves.

با این وجود مشکل اینجاست که ترسیم نقشه شبکه همیشه امکان پذیر نیست این کار بسیار پر هزینه است و عملی نیست. غیر اخلاقی است یا اگر بخواهم رک بگویم اصلا امکان پذیر نیست. بنابراین چگونه می توانیم بفهمیم که چه کسانی در مرکز شبکه قرار دارند آن هم بدون ترسیم نقشه شبکه نتیجه ای که ما گرفتیم این بود که از واقعیتی قدیمی و حقیقتی شناخته شده در رابطه با شبکه های اجتماعی استفاده کنیم که این گونه مطرح می شود آیا میدانستید که دوستانتان از شما دوستان بیش تری دارند؟ دوستانتان از شما دوستان بیش تری دارند، این به عنوان پارادکس دوستی شناخته می شود فردی بسیار اجتماعی را در یک شبکه اجتماعی در نظر بگیرید مثل یک میزبانی که صد نفر مهمان دارد و یک مردم گریز که فقط یک دوست دارد و شما یک فرد را به صورت تصادفی از جمعیت انتخاب میکنید احتمال اینکه میزبان را بشناسند خیلی بیش تر است و اگر میزبان مهمانی را به عنوان دوست خود انتخاب کنند آن میزبان صد دوست خواهد داشت بنابراین او دوستان بیش تری نسبت به سایر افراد خواهد داشت و در اصل این چیزی است که به عنوان پارادکس دوستی شناخته میشود دوستان افرادی که به طور تصادفی انتخاب شده اند درجه بیش تری دارند و نسبت به خود افرادی که تصادفی انتخاب شده اند مرکزی تر هستند

And you can get an intuitive appreciation for this if you imagine just the people at the perimeter of the network. If you pick this person, the only friend they have to nominate is this person, who, by construction, must have at least two and typically more friends. And that happens at every peripheral node. And in fact, it happens throughout the network as you move in, everyone you pick, when they nominate a random -- when a random person nominates a friend of theirs, you move closer to the center of the network. So, we thought we would exploit this idea in order to study whether we could predict phenomena within networks. Because now, with this idea we can take a random sample of people, have them nominate their friends, those friends would be more central, and we could do this without having to map the network.

اگر تنها افراد پیرامون شبکه را تصور کنید میتوانید درک بصری نسبت به این موضوع داشته باشید اگر این فرد را انتخاب کنید تنها فردی که او می تواند به عنوان دوستش انتخاب کند این فرد است که از نظر ساختاری باید حداقل به طور معمول دو دوست یا تعداد بیش تری دوست داشته باشد و این روند پیرامون هر گره رخ می دهد و در واقع این روند در تمام شبکه ها رخ می دهد. وقتی داخل شبکه می شوید هر فردی که انتخاب میکنید وقتی یک فرد تصادفی، یکی از دوستانش را انتخاب میکند به مرکز شبکه نزدیک تر می شوید بنابراین فکر کردیم می توانیم این ایده را به کار گیریم تا ببینیم آیا می شود پدیده ها را داخل شبکه ها پیش بینی کرد زیرا الان با این ایده می توانیم یک نمونه تصادفی از مردم را انتخاب کنیم و از آن ها بخواهیم که دوستانشان را مشخص کنند دوستانشان مرکزی تر خواهند بود و می توانیم این کار را بدون ترسیم نقشه شبکه انجام دهیم
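The friendship paradox, and the sampling trick built on it, can be checked on a tiny example. This is a minimal sketch, assuming a made-up "party" network of one host and five guests; the helper names (`mean_friend_degree`, `nominate_friends`) are hypothetical and not from the study.

```python
import random


def build_graph(edges):
    """Build an undirected adjacency map; every person listed has at least one friend."""
    graph = {}
    for left, right in edges:
        graph.setdefault(left, set()).add(right)
        graph.setdefault(right, set()).add(left)
    return graph


def mean_degree(graph):
    """Average number of friends of a randomly chosen person."""
    return sum(len(friends) for friends in graph.values()) / len(graph)


def mean_friend_degree(graph):
    """Expected number of friends of a random friend of a randomly chosen person."""
    per_person = []
    for friends in graph.values():
        per_person.append(sum(len(graph[f]) for f in friends) / len(friends))
    return sum(per_person) / len(per_person)


def nominate_friends(graph, sample, rng):
    """Each sampled person names one of their friends at random."""
    return [rng.choice(sorted(graph[person])) for person in sample]


# Hypothetical network: a party host who knows five guests.
party = build_graph([("host", f"guest{i}") for i in range(1, 6)])

rng = random.Random(0)
sample = rng.sample(sorted(party), 3)          # random people
nominated = nominate_friends(party, sample, rng)  # their named friends

print("mean degree of a random person:", round(mean_degree(party), 2))
print("mean degree of a nominated friend:", round(mean_friend_degree(party), 2))
print("sample:", sample, "nominated:", nominated)
```

No map of the whole network is needed to run `nominate_friends`: each sampled person only has to name a friend, which is what makes the approach practical.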

And we tested this idea with an outbreak of H1N1 flu at Harvard College in the fall and winter of 2009, just a few months ago. We took 1,300 randomly selected undergraduates, we had them nominate their friends, and we followed both the random students and their friends daily in time to see whether or not they had the flu. And we did this passively by looking at whether or not they'd gone to university health services. And also, we had them [actively] email us a couple of times a week. Exactly what we predicted happened. So the random group is in the red line. The epidemic in the friends group has shifted to the left, over here. And the difference in the two is 16 days. By monitoring the friends group, we could get 16 days' advance warning of an impending epidemic in this human population.

این ایده را در مورد شیوع آنفولانزای H1N1 در یکی از دانشکده های هاروارد مورد آزمایش قرار دادیم. زمستان سال ۲۰۰۹ یعنی چند ماه پیش هزارو سیصد دانشجو را به صورت تصادفی انتخاب کردیم از آن ها خواستیم تا دوستانشان را معرفی کنند و هر روز هم افراد تصادفی و هم دوستانشان را دنبال کردیم. تا ببینیم که آیا مبتلا به همه گیری آنفولانزا شده اند یا خیر. و با توجه به اینکه آیا دانشجویان به خدمات درمانی دانشگاه مراجعه کرده بودند یا خیر این کار را به صورت غیر مستقیم انجام دادیم و همچنین از آن ها خواستیم چند بار در هفته تا به صورت مستقیم از طریق ایمیل با ما در ارتباط باشند درست همان چیزی اتفاق افتاد که انتظار داشتیم. گروه تصادفی روی خط قرمز نشان داده شده و همه گیری در گروه دوستان به سمت چپ حرکت کرده و تفاوت بین این دو شانزده روز است با مشاهده گروه دوستان میتوانیم شانزده روز زودتر از یک همه گیری که در شرف وقوع در یک جمعیت انسانی است با خبر شویم.

Now, in addition to that, if you were an analyst who was trying to study an epidemic or to predict the adoption of a product, for example, what you could do is you could pick a random sample of the population, also have them nominate their friends and follow the friends and follow both the randoms and the friends. Among the friends, the first evidence you saw of a blip above zero in adoption of the innovation, for example, would be evidence of an impending epidemic. Or you could see the first time the two curves diverged, as shown on the left. When did the randoms -- when did the friends take off and leave the randoms, and [when did] their curve start shifting? And that, as indicated by the white line, occurred 46 days before the peak of the epidemic. So this would be a technique whereby we could get more than a month-and-a-half warning about a flu epidemic in a particular population.

علاوه بر این اگر شما یک تحلیل گری بودید که در مورد همه گیری مطالعه می کردید یا حتی در مورد پیش بینی جذب یک محصول تحقیق می کردید کاری که می توانید انجام دهید این است که یک نمونه تصادفی از جمعیت را انتخاب کنید و از آن ها بخواهید که دوستانشان را معرفی کنند و سپس آن ها را دنبال کنید هم افراد تصادفی را دنبال کنید و هم دوستانشان را در بین دوستان اولین مدرک بالای صفری که روی صفحه نمودار مشاهده کردید مثلا در جذب نوآوری مدرکی است از یک همه گیری قریب الوقوع یا اولین باری که دیدید دو منحنی از هم دور می شوند همان طور که سمت چپ نشان داده شده زمانی که نمونه های تصادفی -- زمانی که دوستان افزایش می بایند و از نمونه تصادفی بالاتر می روند و منحنی آن ها شروع به حرکت میکند که اینجا با خط سفید نشان داده شده چهل و شش روز قبل از اینکه به اوج همه گیری برسیم این اتفاق افتاده پس این تکنیکی است که توسط آن می توانیم از شیوع یک آنفولانزای همه گیر در یک جمعیت جلوگیری کنیم
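The two ways of reading the curves (how far the friends curve is shifted, and when the two curves first diverge) can be sketched as follows. This is a minimal illustration with made-up cumulative counts, not the Harvard data; the function names and the choice of "half of the final total" as the comparison level are assumptions.

```python
def first_day_reaching(curve, level):
    """First day on which a cumulative curve is at or above `level`."""
    for day, value in enumerate(curve):
        if value >= level:
            return day
    return None


def lead_time(friends, randoms, fraction=0.5):
    """Days by which the friends curve reaches `fraction` of its final total
    before the random curve reaches the same fraction of its own final total."""
    friends_day = first_day_reaching(friends, fraction * friends[-1])
    randoms_day = first_day_reaching(randoms, fraction * randoms[-1])
    return randoms_day - friends_day


def first_divergence(friends, randoms, min_gap):
    """First day on which the friends curve exceeds the random curve by `min_gap`."""
    for day, (friend_total, random_total) in enumerate(zip(friends, randoms)):
        if friend_total - random_total >= min_gap:
            return day
    return None


# Hypothetical cumulative case counts per day (friends run two days ahead).
randoms_cumulative = [0, 0, 0, 1, 2, 4, 7, 9, 10, 10]
friends_cumulative = [0, 1, 2, 4, 7, 9, 10, 10, 10, 10]

print("lead time in days:", lead_time(friends_cumulative, randoms_cumulative))
print("first divergence on day:",
      first_divergence(friends_cumulative, randoms_cumulative, min_gap=1))
```

With real surveillance data the curves are noisy, so the divergence threshold `min_gap` would need to be chosen with some statistical care rather than set to one case.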

I should say that how much advance notice one might get about something depends on a host of factors. It could depend on the nature of the pathogen -- different pathogens, using this technique, you'd get different warning -- or other phenomena that are spreading, or frankly, on the structure of the human network. Now in our case, although it wasn't necessary, we could also actually map the network of the students.

باید بگویم که میزان پیشرفته بودن اخطاری که یک نفر در مورد چیزی می تواند دریافت کند بستگی به عوامل زیادی دارد. ممکن است با توجه به نوع بیماری بیماری های متفاوت با استفاده از این تکنیک هشدار های متفاوتی اعلام کنند، یا حتی اتفاقات دیگری که در حال وقوع است یا اگر بخواهمم رک بگویم در ساختار یک شبکه انسانی. حالا در مورد ما با وجود اینکه لازم نبود اما نقشه شبکه دانشجویان را ترسیم کردیم.

So, this is a map of 714 students and their friendship ties. And in a minute now, I'm going to put this map into motion. We're going to take daily cuts through the network for 120 days. The red dots are going to be cases of the flu, and the yellow dots are going to be friends of the people with the flu. And the size of the dots is going to be proportional to how many of their friends have the flu. So bigger dots mean more of your friends have the flu. And if you look at this image -- here we are now in September the 13th -- you're going to see a few cases light up. You're going to see kind of blooming of the flu in the middle. Here we are on October the 19th. The slope of the epidemic curve is approaching now, in November. Bang, bang, bang, bang, bang -- you're going to see lots of blooming in the middle, and then you're going to see a sort of leveling off, fewer and fewer cases towards the end of December. And this type of a visualization can show that epidemics like this take root and affect central individuals first, before they affect others.

این نقشه ۷۱۴ دانشجو و گره های دوستانشان است. حالا در یک دقیقه میخواهم این نقشه را به حرکت در آورم و تغییرات روزانه شبکه را در ۱۲۰ روز نشان دهم. نقاط قرمز نمایانگر افرادی مبتلا به آنفولانزا هستند و نقاط زرد نمایانگر دوستان افرادی است که مبتلا به آنفولانزا هستند سایز نقطه ها متناسب با تعداد دوستانی است که از یک نفر آنفولانزا گرفته اند. پس هر چه نقطه بزرگتر باشد نشان دهنده این است که تعداد بیش تری از دوستانتان به آنفولاتزا مبتلا هستند. و اگر به این تصویر نگاه کنید، اینجا در سیزدهم سپتامبر هستیم تعداد کسانی که مبتلا به آنفولانزا هستند کم است اینجا گسترش آنفولانزا را در میانه شبکه میبینید اینجا ۱۹ اکتبر است. در نوامبر به شیب منحنی همه گیری نزدیک می شویم. بنگ بنگ بنگ ، حالا تعداد زیادی نقطه در مرکز شبکه میبینید و در اینحا مقداری کاهش در شیوع آنفولانزا میبینید و تا پایان دسامبر روز به روز تعداد کم تر می شود و این نوع تصویر سازی نشان می دهد که این نوع همه گیری ها مرکز را نشانه میگیرند و افراد مرکزی را قبل از دیگران تحت تاثیر قرار می دهند

Now, as I've been suggesting, this method is not restricted to germs, but actually to anything that spreads in populations. Information spreads in populations, norms can spread in populations, behaviors can spread in populations. And by behaviors, I can mean things like criminal behavior, or voting behavior, or health care behavior, like smoking, or vaccination, or product adoption, or other kinds of behaviors that relate to interpersonal influence. If I'm likely to do something that affects others around me, this technique can get early warning or early detection about the adoption within the population. The key thing is that for it to work, there has to be interpersonal influence. It cannot be because of some broadcast mechanism affecting everyone uniformly.

حالا همان طور که پیشنهاد دادم این روش تنها به میکروب ها محدود نمیشود. و روی هر چیزی که در یک جمعیت پخش می شود تاثیر گذار است اطلاعات در جمعیت پخش می شود هنجار ها در جمعیت پخش می شوند رفتار ها در جمعیت پخش می شوند و منظور من از رفتار می تواند چیزهایی مثل اخلاق مجرمانه رفتار انتخاباتی، یا رفتار های مربوط به سلامتی مثل سیگار کشیدن یا واکسیناسیون جذب محصول یا هر نوع رفتار دیگری که مربوط به تاثیرات درون فردی می شود. اگر من قرار است کاری انجام دهم که افراد کنار من را تحت تاثیر قرار دهد این تکنیک می تواند زودتر هشدار دهد و جذب در یک جکعیت را زودتر شناسایی کند نکته اصلی برای این کار این است که باید تاثیرات درون فردی حتما وجود داشته باشد صرفا با یک مکانیزم انتشاری نمی توان همه را به طور یکسان تحت تاثیر قرار داد

Now the same insights about networks can also be exploited in other ways, for example, in targeting specific people for interventions. So, for example, most of you are probably familiar with the notion of herd immunity. So, if we have a population of a thousand people, and we want to make the population immune to a pathogen, we don't have to immunize every single person. If we immunize 960 of them, it's as if we had immunized a hundred [percent] of them. Because even if one or two of the non-immune people get infected, there's no one for them to infect. They are surrounded by immunized people. So 96 percent is as good as 100 percent. Well, some other scientists have estimated what would happen if you took a 30 percent random sample of these 1000 people, 300 people and immunized them. Would you get any population-level immunity? And the answer is no. But if you took this 30 percent, these 300 people and had them nominate their friends and took the same number of vaccine doses and vaccinated the friends of the 300 -- the 300 friends -- you can get the same level of herd immunity as if you had vaccinated 96 percent of the population at a much greater efficiency, with a strict budget constraint.

حالا برداشتی مشابه با توجه به شبکه می توان داشت البته برداشتی متفاوت نیز می توان داشت. برای مثال،در مورد مداخله های گروهی برای تغییر یک نفر خاص برای مثال اکثر شما با شعار ایمنی جامعه آشنا هستید و اگر بخواهیم این جمعیت را از یک بیماری محفوظ کنیم لازم نیست که همه افراد را در برابر بیماری مقاوم سازیم اگر ۹۶۰ نفر را واکسینه کنیم انگار صد در صد افراد را مقاوم کردیم زیرا اگر حتی یک یا دو نفر از افراد واکسینه نشده مبتلا به بیماری شوند دیگر کسی نیست که به بیماری مبتلا شود آنها توسط افراد واکسینه شده احاطه شده اند بنابراین ۹۶ درصد به خوبی ۱۰۰ درصد عمل میکند برخی از دانشمندان تخمین زده اند اگر ۳۰ درصد از جمعیت ۱۰۰۰ تایی را تصادفی انتخاب کنید و این ۳۰۰ نفر را واکسینه کنیم چه اتفاقی خواهد افتاد آیا ایمن سازی سطحی جمعیت خواهیم داشت؟ جواب خیر است اما اگر شما این ۳۰۰ نفر را انتخاب کنید و از آن ها بخواهید که دوستانشان را معرفی کنند و دوستان این ۳۰۰ نفر را نیز واکسینه کنید همان میزان ایمنی جامعه را خواهیم داشت انگار که ۹۶ درصد جمعیت را با همان تاثیر واکسینه کرده اید به علاوه بودجه نیز به مقدار زیادی کاهش می یابد.
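Why immunizing nominated friends can beat immunizing the random sample itself can be illustrated on a toy network. This is a minimal sketch under strong assumptions: the network is made up, and "largest cluster of connected unprotected people" is used as a crude stand-in for how far an outbreak could travel. It is not the model behind the 30 percent and 96 percent figures quoted above.

```python
import random


def build_graph(edges):
    """Build an undirected adjacency map from a list of (person, person) ties."""
    graph = {}
    for left, right in edges:
        graph.setdefault(left, set()).add(right)
        graph.setdefault(right, set()).add(left)
    return graph


def nominate_friends(graph, sample, rng):
    """Each sampled person names one friend; duplicates collapse into one dose."""
    return {rng.choice(sorted(graph[person])) for person in sample}


def largest_unprotected_cluster(graph, immunized):
    """Size of the largest connected group of people who are not immunized."""
    seen = set(immunized)
    best = 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # flood fill through unprotected neighbors
            node = stack.pop()
            size += 1
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best


# Hypothetical network: two well-connected hosts, each with three guests.
edges = [("host1", "host2")]
edges += [("host1", f"a{i}") for i in range(3)]
edges += [("host2", f"b{i}") for i in range(3)]
village = build_graph(edges)

sample = ["a0", "b0"]  # stands in for a random sample of the population
random_plan = set(sample)
friend_plan = nominate_friends(village, sample, random.Random(0))

print("immunize the sample:", largest_unprotected_cluster(village, random_plan))
print("immunize their friends:", largest_unprotected_cluster(village, friend_plan))
```

In this toy case the same two doses leave either one large connected group (when given to the sampled guests) or only isolated individuals (when given to the hosts they nominate).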

And similar ideas can be used, for instance, to target distribution of things like bed nets in the developing world. If we could understand the structure of networks in villages, we could target to whom to give the interventions to foster these kinds of spreads. Or, frankly, for advertising with all kinds of products. If we could understand how to target, it could affect the efficiency of what we're trying to achieve. And in fact, we can use data from all kinds of sources nowadays [to do this].

ایده های مشابهی میتواند مورد استفاده قرار گیرد، مثلا هدف قرار دادن توزیع پشه بند اگر بتوانیم ساختار شبکه های روستاها را درک کنیم متوجه خواهیم شد در کار چه کسی می توانیم مداخله کنیم تا این نوع شیوع را پرورش دهیم. یا اگر رک بخواهم بگویم برای تبلیغ هر نوع محصولی اگر بفهمیم که چه طور باید مورد هدف قرار دهیم روی کارایی آنچه سعی در به دست آوردنش داریم اثر خواهد گذاشت در واقع می توانیم از همه داده های منابع امروز برای انجام این کار استفاده کنیم

This is a map of eight million phone users in a European country. Every dot is a person, and every line represents a volume of calls between the people. And we can use such data, that's being passively obtained, to map these whole countries and understand who is located where within the network. Without actually having to query them at all, we can get this kind of a structural insight. And other sources of information, as you're no doubt aware, are available about such features, from email interactions, online interactions, online social networks and so forth. And in fact, we are in the era of what I would call "massive-passive" data collection efforts. There are all kinds of ways we can use massively collected data to create sensor networks to follow the population, understand what's happening in the population, and intervene in the population for the better. Because these new technologies tell us not just who is talking to whom, but where everyone is, and what they're thinking based on what they're uploading on the Internet, and what they're buying based on their purchases. And all this administrative data can be pulled together and processed to understand human behavior in a way we never could before.

این نقشه 8 میلیون از کاربران موبایل در کشورهای اروپایی است هر نقطه نشانگر یک فرد است و هر خط نمایانگر میزان تماس های برقرار شده بین افراد می توانیم از چنین داده هایی که به صورت غیر مستقیم به دست آمده است استفاده کنیم تا نقشه همه کشور را ترسیم کنیم و بفهمیم که هر فردی بدون نیاز به تحقیق و پرس و جوی مستقیم کجای این شبکه قرار گرفته است و نوع ساختار درونی این ارتباطات را کشف خواهیم کرد سایر منابع اطلاعاتی که در این رابطه موجود است از تعاملات ایمیلی گرفته تا تعاملات آنلاین، شبکه های اجتماعی آنلاین و غیره. در واقع در دوره ای قرار داریم که من آن را تلاش برای جمع آوری داده عظیم و غیر مستقیم می نامم. همه این ها روش هایی غیر مستقیم برای جمع آوری داده هستند تا شبکه ای حساس بسازیم که به وسیله آن بتوان جمعیت را دنبال کرد و فهمید که چه اتفاقی در جمعیت در حال رخ دادن است و بتوان در جمعیت مداخله و وضعیت را بهبود بخشید زیرا این تکنولوژی های جدید نه تنها می گوید چه کسی با چه کسی صحبت میکند بلکه میگوید که هرنفر دقیقا کجاست و با توجه به مطالبی که آپلود می کنند متوجه میشویم که به چه چیزی فکر می کنند و چه خرید هایی انجام می دهند و می تواند همه این داده های اداری را یک جا جمع و پردازش کرد تا بتوان رفتار های انسانی را طوری فهمید که تا کنون نمیتوانستیم

So, for example, we could use truckers' purchases of fuel. So the truckers are just going about their business, and they're buying fuel. And we see a blip up in the truckers' purchases of fuel, and we know that a recession is about to end. Or we can monitor the velocity with which people are moving with their phones on a highway, and the phone company can see, as the velocity is slowing down, that there's a traffic jam. And they can feed that information back to their subscribers, but only to their subscribers on the same highway located behind the traffic jam! Or we can monitor doctors' prescribing behaviors, passively, and see how the diffusion of innovation with pharmaceuticals occurs within [networks of] doctors. Or again, we can monitor purchasing behavior in people and watch how these types of phenomena can diffuse within human populations.

بنابراین می توانیم به عنوان مثال از خرید سوخت راننده های کامیون استفاده کنیم راننده های کامیون در کارشان سوخت می خرند و وقتی روی نمودار خرید سوخت راننده های کامیون افزایشی دیدیم می فهمیم که رکود اقتصادی در حال پایان یافتن است یا می توانیم سرعت مردم در بزرگراه ها را با استفاده از تلفن همراهشان نظارت کنیم و شرکت تلفن همراه می تواند ببیند که وقتی سرعت پایین می آید ترافیک ایجاد شده است و می تواند این اطلاعات را به مشترکین خود برگرداند البته فقط به مشترکینی که در همان بزرگراه هستند و پشت ترافیک قرار دارند یا حتی می توانیم نحوه نسخه نوشتن پزشکان را به طور غیر مستقیم نظارت کنیم و ببینیم که انتشار نوآوری در مورد داروها در شبکه پزشکان چگونه رخ می دهد همین طور می توانیم رفتار خرید مردم را نظارت کنیم و ببینیم این نوع پدیده ها چگونه در بین جمعیت پخش می شود

And there are three ways, I think, that these massive-passive data can be used. One is fully passive, like I just described -- as in, for instance, the trucker example, where we don't actually intervene in the population in any way. One is quasi-active, like the flu example I gave, where we get some people to nominate their friends and then passively monitor their friends -- do they have the flu, or not? -- and then get warning. Or another example would be, if you're a phone company, you figure out who's central in the network and you ask those people, "Look, will you just text us your fever every day? Just text us your temperature." And collect vast amounts of information about people's temperature, but from centrally located individuals. And be able, on a large scale, to monitor an impending epidemic with very minimal input from people. Or, finally, it can be more fully active -- as I know subsequent speakers will also talk about today -- where people might globally participate in wikis, or photographing, or monitoring elections, and upload information in a way that allows us to pool information in order to understand social processes and social phenomena.

و به نظر من سه راه برای استفاده از این داده های عظیم و غیرمستقیم وجود دارد یکی کاملا غیر مستقیم است مثل آنچه توضیح دادم، برای نمونه مثال راننده های کامیون که ما در حقیقت هیچ مداخله ای در جمعیت نمی کنیم یکی نیمه مستقیم است مثل مثال آنفولانزا که از برخی افراد می خواهیم که دوستانشان را معرفی کنند و غیر مستقیم دوستانشان را نظارت می کنیم آیا آنفولانزا گرفته اند یا خیر؟ و سپس هشدار دریافت می کنیم یا مثلا اگر یک شرکت مخابراتی هستید، میفهمید که چه افرادی در مرکز این شبکه قرار دارند و به آن افراد می گویید: ممکن است هر روز درجه حرارت بدنتان را برایمان بفرستید؟ فقط کافیه دما را برایمان پیامک کنید و اطلاعات وسیعی در رابطه با درجه حرارت بدن افراد کسب خواهید کرد اما تنها از طریق افرادی که در مرکز شبکه قرار دارند و می توانید در مقیاس وسیعی با استفاده از کم ترین میزان دریافتی از مردم یک همه گیری قریب الوقوع را نظارت کنید یا در نهایت می تواند کاملا مستقیم باشد همان طور که می دانم سخنران های بعدی هم امروز درباره آن صحبت خواهند کرد جایی که مردم ممکن است به طور جهانی در اموری مثل ویکی ها، عکاسی و نظارت بر انتخابات شرکت کنند و اطلاعاتی را آپلود کنند که به ما اجازه دهد اطلاعات را یک جا جمع کنیم تا روند اجتماعی و پدیده های اجتماعی را بفهمیم
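
The quasi-active approach described above depends on first working out who is central in the network. As a minimal sketch only, and not the method used in the talk, the Python below ranks people by their number of distinct contacts (degree), which is just one possible stand-in for "central". The contact list and the names `degree_counts`, `most_central`, `contacts` and `sensors` are hypothetical and invented for illustration.

```python
from collections import defaultdict


def degree_counts(edges):
    """Count distinct contacts for each person in an undirected contact list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore a person listed as their own contact
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {person: len(friends) for person, friends in neighbors.items()}


def most_central(edges, k):
    """Return the k people with the most distinct contacts (ties broken by name)."""
    degrees = degree_counts(edges)
    ranked = sorted(degrees, key=lambda person: (-degrees[person], person))
    return ranked[:k]


# Hypothetical toy data: each pair is one observed contact between two people.
contacts = [
    ("ana", "ben"), ("ana", "cy"), ("ana", "dee"),
    ("ben", "cy"), ("dee", "eli"), ("eli", "fay"),
]

# The people one might ask to text in their temperature every day.
sensors = most_central(contacts, 2)
print(sensors)
```

Degree is used here only because it is the simplest measure to compute from call records; other definitions of centrality would select a different set of people.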

In fact, the availability of these data, I think, heralds a kind of new era of what I and others would like to call "computational social science." It's sort of like when Galileo invented -- or, didn't invent -- came to use a telescope and could see the heavens in a new way, or Leeuwenhoek became aware of the microscope -- or actually invented -- and could see biology in a new way. But now we have access to these kinds of data that allow us to understand social processes and social phenomena in an entirely new way that was never before possible. And with this science, we can understand how exactly the whole comes to be greater than the sum of its parts. And actually, we can use these insights to improve society and improve human well-being.

در واقع من فکر می کنم که در دسترس بودن این داده ها خبر از یک عصر جدید در چیزی دارد که من و دیگران آن را علوم اجتماعی محاسباتی می نامیم این تقریبا شبیه به وقتی است که گالیله اختراع کرد -- یا اختراع نکرد بلکه شروع به استفاده از تلسکوپ کرد و توانست آسمان را طور دیگری نگاه کند یا لیوونهوک از وجود میکروسکوپ باخبر شد یا در واقع میکروسکوپ را اختراع کرد و توانست زیست شناسی را طور دیگری نگاه کند اما حالا ما به این نوع داده ها دسترسی داریم که به ما این امکان را می دهد که روند اجتماعی و پدیده های اجتماعی را به شیوه ای کاملا جدید درک کنیم که پیش از این هیچ وقت امکان پذیر نبود و با استفاده از این علم می توانیم بفهمیم که دقیقا چطور کل بزرگ تر است از جمع اجزایش و در واقع می توانیم از این بینش استفاده کنیم تا شرایط جامعه و رفاه انسان ها را بهبود بخشیم

Thank you.

متشکرم

Related talks

Nicholas Christakis: The hidden influence of social networks

Dan Dennett: Dangerous memes

Laurie Garrett: Lessons from the 1918 flu

Gary Slutkin: Let's treat violence like a contagious disease

Andreas Raptopoulos: No roads? There's a drone for that

Eric Berlow and Sean Gourley: Mapping ideas worth spreading
