Nicholas Christakis: How social networks predict epidemics

For the last 10 years, I've been spending my time trying to figure out how and why human beings assemble themselves into social networks. And the kind of social network I'm talking about is not the recent online variety, but rather, the kind of social networks that human beings have been assembling for hundreds of thousands of years, ever since we emerged from the African savannah. So, I form friendships and co-worker and sibling and relative relationships with other people who in turn have similar relationships with other people. And this spreads on out endlessly into a distance. And you get a network that looks like this. Every dot is a person. Every line between them is a relationship between two people -- different kinds of relationships. And you can get this kind of vast fabric of humanity, in which we're all embedded.

And my colleague, James Fowler, and I have been studying for quite some time what are the mathematical, social, biological and psychological rules that govern how these networks are assembled and what are the similar rules that govern how they operate, how they affect our lives. But recently, we've been wondering whether it might be possible to take advantage of this insight, to actually find ways to improve the world, to do something better, to actually fix things, not just understand things. So one of the first things we thought we would tackle would be how we go about predicting epidemics.

And the current state of the art in predicting an epidemic -- if you're the CDC or some other national body -- is to sit in the middle where you are and collect data from physicians and laboratories in the field that report the prevalence or the incidence of certain conditions. So, so and so patients have been diagnosed with something, or other patients have been diagnosed, and all these data are fed into a central repository, with some delay. And if everything goes smoothly, one to two weeks from now you'll know where the epidemic was today. And actually, about a year or so ago, there was this promulgation of the idea of Google Flu Trends, with respect to the flu, where by looking at people's searching behavior today, we could know where the flu -- what the status of the epidemic was today, what's the prevalence of the epidemic today.

But what I'd like to show you today is a means by which we might get not just rapid warning about an epidemic, but also actually early detection of an epidemic. And, in fact, this idea can be used not just to predict epidemics of germs, but also to predict epidemics of all sorts of kinds. For example, anything that spreads by a form of social contagion could be understood in this way, from abstract ideas on the left like patriotism, or altruism, or religion to practices like dieting behavior, or book purchasing, or drinking, or bicycle-helmet [and] other safety practices, or products that people might buy, purchases of electronic goods, anything in which there's kind of an interpersonal spread. A kind of a diffusion of innovation could be understood and predicted by the mechanism I'm going to show you now.

So, as all of you probably know, the classic way of thinking about this is the diffusion-of-innovation, or the adoption curve. So here on the Y-axis, we have the percent of the people affected, and on the X-axis, we have time. And at the very beginning, not too many people are affected, and you get this classic sigmoidal, or S-shaped, curve. And the reason for this shape is that at the very beginning, let's say one or two people are infected, or affected by the thing and then they affect, or infect, two people, who in turn affect four, eight, 16 and so forth, and you get the epidemic growth phase of the curve. And eventually, you saturate the population. There are fewer and fewer people who are still available that you might infect, and then you get the plateau of the curve, and you get this classic sigmoidal curve. And this holds for germs, ideas, product adoption, behaviors, and the like. But things don't just diffuse in human populations at random. They actually diffuse through networks. Because, as I said, we live our lives in networks, and these networks have a particular kind of a structure.
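
The S-shaped curve described above can be reproduced with a few lines of arithmetic. What follows is a minimal sketch, not something shown in the talk: it assumes a well-mixed population with no network structure and a simple susceptible-infected model. The function name `si_curve` and the parameter values (1,000 people, a contact rate of 0.5 per day, 40 days) are illustrative assumptions.

```python
def si_curve(population=1000, initial_infected=1, contact_rate=0.5, days=40):
    """Return the cumulative fraction affected on each day (well-mixed SI model)."""
    infected = float(initial_infected)
    curve = []
    for _ in range(days):
        curve.append(infected / population)
        susceptible = population - infected
        # New cases are proportional to the number already infected and to the
        # share of the population that is still available to be infected.
        infected += contact_rate * infected * susceptible / population
    return curve


curve = si_curve()
# Daily incidence is the day-to-day change in the cumulative curve.
daily_new = [later - earlier for earlier, later in zip(curve, curve[1:])]
peak_day = daily_new.index(max(daily_new))
```

Early on, `susceptible / population` is close to 1, so cases grow roughly geometrically; later that factor shrinks toward 0 and the curve flattens, which is the plateau of the sigmoid.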

Now if you look at a network like this -- this is 105 people. And the lines represent -- the dots are the people, and the lines represent friendship relationships. You might see that people occupy different locations within the network. And there are different kinds of relationships between the people. You could have friendship relationships, sibling relationships, spousal relationships, co-worker relationships, neighbor relationships and the like. And different sorts of things spread across different sorts of ties. For instance, sexually transmitted diseases will spread across sexual ties. Or, for instance, people's smoking behavior might be influenced by their friends. Or their altruistic or their charitable giving behavior might be influenced by their coworkers, or by their neighbors. But not all positions in the network are the same.

So if you look at this, you might immediately grasp that different people have different numbers of connections. Some people have one connection, some have two, some have six, some have 10 connections. And this is called the "degree" of a node, or the number of connections that a node has. But in addition, there's something else. So, if you look at nodes A and B, they both have six connections. But if you can see this image [of the network] from a bird's eye view, you can appreciate that there's something very different about nodes A and B. So, let me ask you this -- I can cultivate this intuition by asking a question -- who would you rather be if a deadly germ was spreading through the network, A or B? (Audience: B.) Nicholas Christakis: B, it's obvious. B is located on the edge of the network. Now, who would you rather be if a juicy piece of gossip were spreading through the network? A. And you have an immediate appreciation that A is going to be more likely to get the thing that's spreading and to get it sooner by virtue of their structural location within the network. A, in fact, is more central, and this can be formalized mathematically. So, if we want to track something that was spreading through a network, what we ideally would like to do is to set up sensors on the central individuals within the network, including node A, monitor those people that are right there in the middle of the network, and somehow get an early detection of whatever it is that is spreading through the network.
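
The difference between two nodes with the same degree can be made concrete on a toy graph. This is a minimal sketch under stated assumptions: the nine-node graph below is invented for illustration and is not the 105-person network from the talk, and closeness centrality is only one of several ways centrality can be formalized; the talk does not say which measure was used.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from a list of (person, person) ties."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def closeness(graph, source):
    """(n - 1) divided by the sum of shortest-path distances from source.

    Assumes the graph is connected.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives shortest hop counts
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return (len(graph) - 1) / sum(dist.values())


# Hypothetical network: A sits in the middle, B sits near the edge.
edges = [("A", "C"), ("A", "D"), ("A", "E"),
         ("C", "F"), ("D", "G"), ("E", "B"),
         ("B", "H"), ("B", "I")]
graph = build_graph(edges)
degree = {node: len(neighbors) for node, neighbors in graph.items()}
closeness_a = closeness(graph, "A")
closeness_b = closeness(graph, "B")
```

In this toy graph A and B both have three connections, but A is fewer hops from everyone else, so its closeness score is higher. That is the sense in which something spreading through the network tends to reach A sooner.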

So if you saw them contract a germ or a piece of information, you would know that, soon enough, everybody was about to contract this germ or this piece of information. And this would be much better than monitoring six randomly chosen people, without reference to the structure of the population. And in fact, if you could do that, what you would see is something like this. On the left-hand panel, again, we have the S-shaped curve of adoption. In the dotted red line, we show what the adoption would be in the random people, and in the left-hand line, shifted to the left, we show what the adoption would be in the central individuals within the network. On the Y-axis are the cumulative instances of contagion, and on the X-axis is the time. And on the right-hand side, we show the same data, but here with daily incidence. And what we show here is -- like, here -- very few people are affected, more and more and more and up to here, and here's the peak of the epidemic. But shifted to the left is what's occurring in the central individuals. And this difference in time between the two is the early detection, the early warning we can get, about an impending epidemic in the human population.

The problem, however, is that mapping human social networks is not always possible. It can be expensive, not feasible, unethical, or, frankly, just not possible to do such a thing. So, how can we figure out who the central people are in a network without actually mapping the network? What we came up with was an idea to exploit an old fact, or a known fact, about social networks, which goes like this: Do you know that your friends have more friends than you do? Your friends have more friends than you do, and this is known as the friendship paradox. Imagine a very popular person in the social network -- like a party host who has hundreds of friends -- and a misanthrope who has just one friend, and you pick someone at random from the population; they are much more likely to know the party host. And if they nominate the party host as their friend, that party host has a hundred friends, therefore, has more friends than they do. And this, in essence, is what's known as the friendship paradox. The friends of randomly chosen people have higher degree, and are more central than the random people themselves.

And you can get an intuitive appreciation for this if you imagine just the people at the perimeter of the network. If you pick this person, the only friend they have to nominate is this person, who, by construction, must have at least two and typically more friends. And that happens at every peripheral node. And in fact, it happens throughout the network as you move in, everyone you pick, when they nominate a random -- when a random person nominates a friend of theirs, you move closer to the center of the network. So, we thought we would exploit this idea in order to study whether we could predict phenomena within networks. Because now, with this idea we can take a random sample of people, have them nominate their friends, those friends would be more central, and we could do this without having to map the network.
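
The friendship paradox can be checked by hand on a small example. The sketch below is illustrative only: the seven-person network (a party host, five guests and a loner) is invented, and the calculation assumes each person nominates one of their friends uniformly at random. It compares the average number of friends a person has with the average number of friends their nominee has.

```python
def build_graph(edges):
    """Build an undirected adjacency map from a list of friendship ties."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


# Hypothetical network: one popular host, five guests, one loner.
edges = [("host", "a"), ("host", "b"), ("host", "c"), ("host", "d"),
         ("host", "e"), ("a", "b"), ("e", "loner")]
graph = build_graph(edges)
degree = {person: len(friends) for person, friends in graph.items()}

# Average number of friends of a randomly chosen person.
mean_degree = sum(degree.values()) / len(degree)

# Average number of friends of the friend that a randomly chosen person
# nominates, assuming they pick uniformly among their own friends.
mean_friend_degree = sum(
    sum(degree[friend] for friend in friends) / len(friends)
    for friends in graph.values()
) / len(graph)
```

Here a random person has 2 friends on average, while a nominated friend has about 3.4, because the host appears on almost everyone's list. The nominated friends are therefore better connected than the random sample, and no map of the network was needed to find them.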

And we tested this idea with an outbreak of H1N1 flu at Harvard College in the fall and winter of 2009, just a few months ago. We took 1,300 randomly selected undergraduates, we had them nominate their friends, and we followed both the random students and their friends daily in time to see whether or not they had the flu. And we did this passively by looking at whether or not they'd gone to university health services. And also, we had them [actively] email us a couple of times a week. Exactly what we predicted happened. So the random group is in the red line. The epidemic in the friends group has shifted to the left, over here. And the difference in the two is 16 days. By monitoring the friends group, we could get 16 days' advance warning of an impending epidemic in this human population.
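
One simple way to read a lead time off two epidemic curves is to compare the days on which daily incidence peaks in each group. The sketch below is an assumption-laden illustration: the daily counts are invented, the helper names `peak_day` and `lead_time` are hypothetical, and the talk does not describe how the 16-day figure was actually estimated.

```python
def peak_day(daily_cases):
    """Index of the day with the most new cases (the first such day on ties)."""
    return max(range(len(daily_cases)), key=lambda day: daily_cases[day])


def lead_time(friends_daily, random_daily):
    """Days by which the friends group peaks ahead of the random group."""
    return peak_day(random_daily) - peak_day(friends_daily)


# Invented daily new-case counts, one entry per day, for illustration only.
random_daily = [0, 0, 0, 1, 2, 4, 7, 4, 2, 1, 0, 0]
friends_daily = [1, 2, 4, 7, 4, 2, 1, 0, 0, 0, 0, 0]

lead = lead_time(friends_daily, random_daily)
```

With these made-up counts the friends group peaks on day 3 and the random group on day 6, so `lead` is 3 days. Real data would be noisier, and a curve fit over the whole series would be more robust than a single peak.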

Now, in addition to that, if you were an analyst who was trying to study an epidemic or to predict the adoption of a product, for example, what you could do is you could pick a random sample of the population, also have them nominate their friends and follow the friends and follow both the randoms and the friends. Among the friends, the first evidence you saw of a blip above zero in adoption of the innovation, for example, would be evidence of an impending epidemic. Or you could see the first time the two curves diverged, as shown on the left. When did the randoms -- when did the friends take off and leave the randoms, and [when did] their curve start shifting? And that, as indicated by the white line, occurred 46 days before the peak of the epidemic. So this would be a technique whereby we could get more than a month-and-a-half warning about a flu epidemic in a particular population.
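
The "first time the two curves diverged" signal can be sketched as a threshold on the gap between the two cumulative curves. This is a minimal illustration, not the analysis behind the 46-day figure: the counts are invented, both groups are assumed to be the same size, and the threshold `min_gap` is an arbitrary assumption that would need tuning on real data.

```python
from itertools import accumulate


def first_divergence_day(friends_daily, random_daily, min_gap):
    """First day the friends' cumulative count leads the randoms' by min_gap."""
    friends_cumulative = accumulate(friends_daily)
    random_cumulative = accumulate(random_daily)
    for day, (f, r) in enumerate(zip(friends_cumulative, random_cumulative)):
        if f - r >= min_gap:
            return day
    return None  # the curves never separated by that much


# Invented daily new-case counts for two equally sized groups.
random_daily = [0, 0, 0, 0, 1, 2, 4, 7, 4, 2, 1, 0]
friends_daily = [0, 1, 2, 4, 7, 4, 2, 1, 0, 0, 0, 0]

divergence_day = first_divergence_day(friends_daily, random_daily, min_gap=2)
# Peak of the epidemic in the random group, taken as the day with most new cases.
random_peak = random_daily.index(max(random_daily))
warning_days = random_peak - divergence_day
```

In this toy series the curves separate on day 2 and the random group peaks on day 7, giving 5 days of warning. A lower threshold warns earlier but is more easily triggered by noise.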

I should say that how far in advance one might get notice of something depends on a host of factors. It could depend on the nature of the pathogen -- different pathogens, using this technique, you'd get different warning -- or other phenomena that are spreading, or frankly, on the structure of the human network. Now in our case, although it wasn't necessary, we could also actually map the network of the students.

So, this is a map of 714 students and their friendship ties. And in a minute now, I'm going to put this map into motion. We're going to take daily cuts through the network for 120 days. The red dots are going to be cases of the flu, and the yellow dots are going to be friends of the people with the flu. And the size of the dots is going to be proportional to how many of their friends have the flu. So bigger dots mean more of your friends have the flu. And if you look at this image -- here we are now in September the 13th -- you're going to see a few cases light up. You're going to see kind of blooming of the flu in the middle. Here we are on October the 19th. The slope of the epidemic curve is approaching now, in November. Bang, bang, bang, bang, bang -- you're going to see lots of blooming in the middle, and then you're going to see a sort of leveling off, fewer and fewer cases towards the end of December. And this type of a visualization can show that epidemics like this take root and affect central individuals first, before they affect others.

Now, as I've been suggesting, this method is not restricted to germs, but actually to anything that spreads in populations. Information spreads in populations, norms can spread in populations, behaviors can spread in populations. And by behaviors, I can mean things like criminal behavior, or voting behavior, or health care behavior, like smoking, or vaccination, or product adoption, or other kinds of behaviors that relate to interpersonal influence. If I'm likely to do something that affects others around me, this technique can get early warning or early detection about the adoption within the population. The key thing is that for it to work, there has to be interpersonal influence. It cannot be because of some broadcast mechanism affecting everyone uniformly.

Now the same insights can also be exploited -- with respect to networks -- can also be exploited in other ways, for example, in the use of targeting specific people for interventions. So, for example, most of you are probably familiar with the notion of herd immunity. So, if we have a population of a thousand people, and we want to make the population immune to a pathogen, we don't have to immunize every single person. If we immunize 960 of them, it's as if we had immunized a hundred [percent] of them. Because even if one or two of the non-immune people get infected, there's no one for them to infect. They are surrounded by immunized people. So 96 percent is as good as 100 percent. Well, some other scientists have estimated what would happen if you took a 30 percent random sample of these 1,000 people, 300 people, and immunized them. Would you get any population-level immunity? And the answer is no. But if you took this 30 percent, these 300 people, and had them nominate their friends and took the same number of vaccine doses and vaccinated the friends of the 300 -- the 300 friends -- you can get the same level of herd immunity as if you had vaccinated 96 percent of the population at a much greater efficiency, with a strict budget constraint.
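
Why vaccinating nominated friends goes further than vaccinating the random sample itself can be seen on a toy network. The sketch below does not reproduce the 30 percent versus 96 percent estimate cited above. It uses an invented ten-person graph with two hubs, a hand-picked "random" sample of two people, and a deterministic nomination rule, all of which are assumptions for illustration. It measures the largest group of unvaccinated people who are still connected to one another, as a rough stand-in for how far an outbreak could travel.

```python
def build_graph(edges):
    """Build an undirected adjacency map from a list of contact ties."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def largest_outbreak(graph, vaccinated):
    """Size of the largest connected group of unvaccinated people."""
    seen = set(vaccinated)
    best = 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first walk over unvaccinated contacts
            node = stack.pop()
            size += 1
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best


# Hypothetical network: two well-connected hubs, each with four contacts.
edges = [("hub1", "hub2")]
edges += [("hub1", leaf) for leaf in "abcd"]
edges += [("hub2", leaf) for leaf in "efgh"]
graph = build_graph(edges)

sample = ["a", "e"]  # stand-in for a random sample; most people are not hubs
# Each sampled person names a friend. A real study would pick one at random;
# here each sampled person has exactly one friend, so the choice is forced.
nominated = {sorted(graph[person])[0] for person in sample}

outbreak_if_sample_vaccinated = largest_outbreak(graph, set(sample))
outbreak_if_friends_vaccinated = largest_outbreak(graph, nominated)
```

With the same two doses, vaccinating the sampled people leaves eight unvaccinated people in one connected group, while vaccinating their nominated friends leaves every unvaccinated person isolated from the others.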

And similar ideas can be used, for instance, to target distribution of things like bed nets in the developing world. If we could understand the structure of networks in villages, we could target to whom to give the interventions to foster these kinds of spreads. Or, frankly, for advertising with all kinds of products. If we could understand how to target, it could affect the efficiency of what we're trying to achieve. And in fact, we can use data from all kinds of sources nowadays [to do this].

This is a map of eight million phone users in a European country. Every dot is a person, and every line represents a volume of calls between the people. And we can use such data, that's being passively obtained, to map these whole countries and understand who is located where within the network. Without actually having to query them at all, we can get this kind of a structural insight. And other sources of information, as you're no doubt aware, are available about such features, from email interactions, online interactions, online social networks and so forth. And in fact, we are in the era of what I would call "massive-passive" data collection efforts. There are all kinds of ways we can use massively collected data to create sensor networks to follow the population, understand what's happening in the population, and intervene in the population for the better. Because these new technologies tell us not just who is talking to whom, but where everyone is, and what they're thinking based on what they're uploading on the Internet, and what they're buying based on their purchases. And all this administrative data can be pulled together and processed to understand human behavior in a way we never could before.

So, for example, we could use truckers' purchases of fuel. So the truckers are just going about their business, and they're buying fuel. And we see a blip up in the truckers' purchases of fuel, and we know that a recession is about to end. Or we can monitor the velocity with which people are moving with their phones on a highway, and the phone company can see, as the velocity is slowing down, that there's a traffic jam. And they can feed that information back to their subscribers, but only to their subscribers on the same highway located behind the traffic jam! Or we can monitor doctors' prescribing behaviors, passively, and see how the diffusion of innovation with pharmaceuticals occurs within [networks of] doctors. Or again, we can monitor purchasing behavior in people and watch how these types of phenomena can diffuse within human populations.

Например, возьмём покупку бензина водителями грузовиков. Водители занимаются своим делом и покупают бензин. Мы видим увеличение покупок бензина и понимаем, что экономический спад скоро закончится. Или мы можем следить за скоростью, с которой люди с телефонами перемещаются по дорогам, а телефонная компания видит, что если скорость падает, это означает, что на дорогах пробки. Они могут предоставлять эту информацию своим клиентам, но только тем, которые движутся по той же дороге, приближаясь к пробке. Или мы можем следить за тем, как врачи выписывают лекарства, и заметить, что распространение новых препаратов происходит внутри определённых врачебных сетей. Или, опять же, мы можем отслеживать поведение покупателей и наблюдать, как это явление распространяется в сообществе.

And there are three ways, I think, that these massive-passive data can be used. One is fully passive, like I just described -- as in, for instance, the trucker example, where we don't actually intervene in the population in any way. One is quasi-active, like the flu example I gave, where we get some people to nominate their friends and then passively monitor their friends -- do they have the flu, or not? -- and then get warning. Or another example would be, if you're a phone company, you figure out who's central in the network and you ask those people, "Look, will you just text us your fever every day? Just text us your temperature." And collect vast amounts of information about people's temperature, but from centrally located individuals. And be able, on a large scale, to monitor an impending epidemic with very minimal input from people. Or, finally, it can be more fully active -- as I know subsequent speakers will also talk about today -- where people might globally participate in wikis, or photographing, or monitoring elections, and upload information in a way that allows us to pool information in order to understand social processes and social phenomena.

Мне кажется, существует три способа, в которых эти пассивно собранные данные могут использоваться. Один из них полностью пассивен, вроде того, что я только что объяснил: как в случае с водителями грузовиков, где мы никоим образом не вмешиваемся в жизнь сообщества. Ещё один - псевдоактивный, как в случае с эпидемией гриппа, где мы просили людей назвать своих друзей, затем пассивно следили за друзьями, заболели они гриппом или нет, и получали предупреждение о болезни. Ещё один пример: вы, телефонная компания, разыскиваете людей в центре сетей и просите их: "Вы не могли бы каждый день сообщать нам о своей температуре? Просто пришлите нам цифру". И собираете огромное количество информации о температуре людей, находящихся в центре сети. Так вы сможете в огромном масштабе отследить приближающуюся эпидемию с минимальной информацией от людей. И, наконец, существует более активный способ - насколько я знаю, о нём ещё будут сегодня говорить, - когда люди по всему миру участвуют в создании вики-сайтов, фотографируют, наблюдают за выборами и загружают информацию, так что её можно объединить для понимания социальных процессов и явлений.

In fact, the availability of these data, I think, heralds a kind of new era of what I and others would like to call "computational social science." It's sort of like when Galileo invented -- or, didn't invent -- came to use a telescope and could see the heavens in a new way, or Leeuwenhoek became aware of the microscope -- or actually invented -- and could see biology in a new way. But now we have access to these kinds of data that allow us to understand social processes and social phenomena in an entirely new way that was never before possible. And with this science, we can understand how exactly the whole comes to be greater than the sum of its parts. And actually, we can use these insights to improve society and improve human well-being.

Мне кажется, что доступность такой информации является вестником новой эпохи, которую я и другие хотели бы назвать эпохой "вычислительной социальной науки". Это похоже на то, как Галилей изобрёл (или не изобретал), начал использовать телескоп и смог увидеть небеса по-новому; или как Левенгук узнал о микроскопе (или сам изобрёл) и смог по-новому взглянуть на биологию. Но теперь у нас есть доступ к данным такого рода, которые позволяют понять социальные процессы и явления совершенно новым, невозможным раньше, способом. С помощью этой науки мы сможем понять, каким образом целое оказывается больше суммы своих элементов. Мы сможем использовать эти открытия, чтобы улучшить общество и благополучие людей.

Thank you.

Спасибо.
