Nicholas Christakis: How social networks predict epidemics

For the last 10 years, I've been spending my time trying to figure out how and why human beings assemble themselves into social networks. And the kind of social network I'm talking about is not the recent online variety, but rather, the kind of social networks that human beings have been assembling for hundreds of thousands of years, ever since we emerged from the African savannah. So, I form friendships and co-worker and sibling and relative relationships with other people who in turn have similar relationships with other people. And this spreads on out endlessly into the distance. And you get a network that looks like this. Every dot is a person. Every line between them is a relationship between two people -- different kinds of relationships. And you can get this kind of vast fabric of humanity, in which we're all embedded.

And my colleague James Fowler and I have been studying for quite some time what the mathematical, social, biological and psychological rules are that govern how these networks are assembled, and what the similar rules are that govern how they operate, how they affect our lives. But recently, we've been wondering whether it might be possible to take advantage of this insight, to actually find ways to improve the world, to do something better, to actually fix things, not just understand things. So one of the first things we thought we would tackle would be how we go about predicting epidemics.

And the current state of the art in predicting an epidemic -- if you're the CDC or some other national body -- is to sit in the middle where you are and collect data from physicians and laboratories in the field that report the prevalence or the incidence of certain conditions. So, so and so patients have been diagnosed with something, or other patients have been diagnosed, and all these data are fed into a central repository, with some delay. And if everything goes smoothly, one to two weeks from now you'll know where the epidemic was today. And actually, about a year or so ago, there was this promulgation of the idea of Google Flu Trends, with respect to the flu, where by looking at people's searching behavior today, we could know where the flu -- what the status of the epidemic was today, what's the prevalence of the epidemic today.

But what I'd like to show you today is a means by which we might get not just rapid warning about an epidemic, but also actually early detection of an epidemic. And, in fact, this idea can be used not just to predict epidemics of germs, but also to predict epidemics of all sorts of kinds. For example, anything that spreads by a form of social contagion could be understood in this way, from abstract ideas on the left like patriotism, or altruism, or religion to practices like dieting behavior, or book purchasing, or drinking, or bicycle-helmet [and] other safety practices, or products that people might buy, purchases of electronic goods, anything in which there's kind of an interpersonal spread. A kind of a diffusion of innovation could be understood and predicted by the mechanism I'm going to show you now.

So, as all of you probably know, the classic way of thinking about this is the diffusion-of-innovation, or the adoption curve. So here on the Y-axis, we have the percent of the people affected, and on the X-axis, we have time. And at the very beginning, not too many people are affected, and you get this classic sigmoidal, or S-shaped, curve. And the reason for this shape is that at the very beginning, let's say one or two people are infected, or affected by the thing and then they affect, or infect, two people, who in turn affect four, eight, 16 and so forth, and you get the epidemic growth phase of the curve. And eventually, you saturate the population. There are fewer and fewer people who are still available that you might infect, and then you get the plateau of the curve, and you get this classic sigmoidal curve. And this holds for germs, ideas, product adoption, behaviors, and the like. But things don't just diffuse in human populations at random. They actually diffuse through networks. Because, as I said, we live our lives in networks, and these networks have a particular kind of a structure.
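
The doubling-then-saturation logic described above can be sketched in a few lines. This is only an illustration under an assumed model (each affected person affects `rate` new people per step, scaled by the fraction of the population still unaffected); the function name `adoption_curve` and all parameter values are hypothetical and do not come from the talk.

```python
def adoption_curve(population=1000, initial=1, rate=1.0, steps=20):
    """Cumulative number of people affected at each time step.

    Assumed model (not from the talk): each affected person would affect
    `rate` new people per step, scaled down by the fraction of the
    population that is still unaffected.
    """
    affected = float(initial)
    curve = [affected]
    for _ in range(steps):
        still_available = 1.0 - affected / population
        affected += rate * affected * still_available
        curve.append(affected)
    return curve


curve = adoption_curve()

# New cases per step: small at first, largest mid-epidemic, small again at the plateau.
new_cases = [later - earlier for earlier, later in zip(curve, curve[1:])]
peak_step = new_cases.index(max(new_cases)) + 1

for step, value in enumerate(curve):
    print(f"t={step:2d}  affected={value:7.1f}")
print("step with the most new cases:", peak_step)
```

Early on the count roughly doubles each step (1, 2, 4, 8, ...); as fewer people remain available, growth slows and the curve flattens, which gives the S shape.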

Now if you look at a network like this -- this is 105 people. And the lines represent -- the dots are the people, and the lines represent friendship relationships. You might see that people occupy different locations within the network. And there are different kinds of relationships between the people. You could have friendship relationships, sibling relationships, spousal relationships, co-worker relationships, neighbor relationships and the like. And different sorts of things spread across different sorts of ties. For instance, sexually transmitted diseases will spread across sexual ties. Or, for instance, people's smoking behavior might be influenced by their friends. Or their altruistic or their charitable giving behavior might be influenced by their coworkers, or by their neighbors. But not all positions in the network are the same.

So if you look at this, you might immediately grasp that different people have different numbers of connections. Some people have one connection, some have two, some have six, some have 10 connections. And this is called the "degree" of a node, or the number of connections that a node has. But in addition, there's something else. So, if you look at nodes A and B, they both have six connections. But if you can see this image [of the network] from a bird's eye view, you can appreciate that there's something very different about nodes A and B. So, let me ask you this -- I can cultivate this intuition by asking a question -- who would you rather be if a deadly germ was spreading through the network, A or B? (Audience: B.) Nicholas Christakis: B, it's obvious. B is located on the edge of the network. Now, who would you rather be if a juicy piece of gossip were spreading through the network? A. And you have an immediate appreciation that A is going to be more likely to get the thing that's spreading and to get it sooner by virtue of their structural location within the network. A, in fact, is more central, and this can be formalized mathematically. So, if we want to track something that was spreading through a network, what we ideally would like to do is to set up sensors on the central individuals within the network, including node A, monitor those people that are right there in the middle of the network, and somehow get an early detection of whatever it is that is spreading through the network.
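
To make the difference between "degree" and "centrality" concrete, here is a minimal sketch on a made-up nine-person network in which nodes A and B have the same degree but sit in different places. The talk does not say which centrality measure was used; closeness centrality is assumed here as one common formalization, and the edge list is purely hypothetical.

```python
from collections import deque

# Hypothetical toy network: A sits in the middle, B near the edge.
edges = [
    ("A", "c1"), ("A", "c2"), ("A", "c3"),
    ("c1", "d1"), ("c2", "d2"), ("c3", "B"),
    ("B", "l1"), ("B", "l2"),
]


def build_graph(edge_list):
    """Turn an undirected edge list into an adjacency dict of sets."""
    graph = {}
    for left, right in edge_list:
        graph.setdefault(left, set()).add(right)
        graph.setdefault(right, set()).add(left)
    return graph


def degree(graph, node):
    """Number of connections a node has."""
    return len(graph[node])


def closeness(graph, node):
    """Closeness centrality: (n - 1) / sum of shortest-path distances.

    Assumes the network is connected. Distances come from a
    breadth-first search starting at `node`.
    """
    distance = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return (len(graph) - 1) / sum(distance.values())


graph = build_graph(edges)
for name in ("A", "B"):
    print(name, "degree:", degree(graph, name),
          "closeness:", round(closeness(graph, name), 3))
```

Both nodes have three connections, yet A is on average fewer steps away from everyone else, so anything spreading through this toy network would tend to reach A sooner.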

So if you saw them contract a germ or a piece of information, you would know that, soon enough, everybody was about to contract this germ or this piece of information. And this would be much better than monitoring six randomly chosen people, without reference to the structure of the population. And in fact, if you could do that, what you would see is something like this. On the left-hand panel, again, we have the S-shaped curve of adoption. In the dotted red line, we show what the adoption would be in the random people, and in the left-hand line, shifted to the left, we show what the adoption would be in the central individuals within the network. On the Y-axis is the cumulative instances of contagion, and on the X-axis is the time. And on the right-hand side, we show the same data, but here with daily incidence. And what we show here is -- like, here -- very few people are affected, more and more and more and up to here, and here's the peak of the epidemic. But shifted to the left is what's occurring in the central individuals. And this difference in time between the two is the early detection, the early warning we can get, about an impending epidemic in the human population.

The problem, however, is that mapping human social networks is not always possible. It can be expensive, not feasible, unethical, or, frankly, just not possible to do such a thing. So, how can we figure out who the central people are in a network without actually mapping the network? What we came up with was an idea to exploit an old fact, or a known fact, about social networks, which goes like this: Do you know that your friends have more friends than you do? Your friends have more friends than you do, and this is known as the friendship paradox. Imagine a very popular person in the social network -- like a party host who has hundreds of friends -- and a misanthrope who has just one friend, and you pick someone at random from the population; they are much more likely to know the party host. And if they nominate the party host as their friend, that party host has a hundred friends and, therefore, has more friends than they do. And this, in essence, is what's known as the friendship paradox. The friends of randomly chosen people have higher degree, and are more central, than the random people themselves.
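
The friendship paradox can be checked by hand on a tiny example. The sketch below uses a made-up seven-person network (a host, five guests, and a loner with a single friend); the names and ties are hypothetical and chosen only so the arithmetic is easy to follow.

```python
# Hypothetical toy network: who counts whom as a friend (ties are mutual).
friends_of = {
    "host": ["guest1", "guest2", "guest3", "guest4", "guest5"],
    "guest1": ["host", "loner"],
    "guest2": ["host"],
    "guest3": ["host"],
    "guest4": ["host"],
    "guest5": ["host"],
    "loner": ["guest1"],
}

# Degree: how many friends each person has.
degree = {person: len(friends) for person, friends in friends_of.items()}

# Average number of friends of a randomly chosen person.
mean_degree = sum(degree.values()) / len(degree)

# For each person, the average number of friends that their friends have.
friends_mean = {
    person: sum(degree[friend] for friend in friends) / len(friends)
    for person, friends in friends_of.items()
}

# Average degree of the friend that a randomly chosen person would nominate.
mean_friend_degree = sum(friends_mean.values()) / len(friends_mean)

# People whose friends have, on average, more friends than they do.
outnumbered = [p for p in friends_of if friends_mean[p] > degree[p]]

print("average friends per person:      ", round(mean_degree, 2))
print("average friends per nominated friend:", round(mean_friend_degree, 2))
print("people with fewer friends than their friends:", len(outnumbered), "of", len(friends_of))
```

In this toy network everyone except the host has fewer friends than their friends do on average, because the host shows up in almost everyone's friend list.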

And you can get an intuitive appreciation for this if you imagine just the people at the perimeter of the network. If you pick this person, the only friend they have to nominate is this person, who, by construction, must have at least two and typically more friends. And that happens at every peripheral node. And in fact, it happens throughout the network as you move in, everyone you pick, when they nominate a random -- when a random person nominates a friend of theirs, you move closer to the center of the network. So, we thought we would exploit this idea in order to study whether we could predict phenomena within networks. Because now, with this idea we can take a random sample of people, have them nominate their friends, those friends would be more central, and we could do this without having to map the network.
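
The sampling procedure just described needs only local answers from each sampled person, never a map of the whole network. A rough sketch, with the caveat that `pick_sensors`, `nominate_friend`, and the toy `friends_of` table are hypothetical names invented here: in the field, the nomination would be a survey answer, and the table below exists only to simulate those answers.

```python
import random


def pick_sensors(sample, nominate_friend):
    """Ask each sampled person to name one friend; the nominees become sensors."""
    return [nominate_friend(person) for person in sample]


# Hypothetical toy network, used only to simulate people's answers.
friends_of = {
    "host": ["ann", "bob", "cat", "dan", "eve"],
    "ann": ["host", "bob"],
    "bob": ["host", "ann"],
    "cat": ["host"],
    "dan": ["host"],
    "eve": ["host", "fay"],
    "fay": ["eve"],
}

rng = random.Random(0)  # fixed seed so repeated runs give the same sample


def nominate_friend(person):
    """Stand-in for asking `person` to name one of their friends."""
    return rng.choice(friends_of[person])


def mean_degree(people):
    """Average number of friends in a group (only computable in simulation)."""
    return sum(len(friends_of[p]) for p in people) / len(people)


random_sample = rng.sample(sorted(friends_of), 4)
sensors = pick_sensors(random_sample, nominate_friend)

print("random sample:", random_sample, "mean degree:", mean_degree(random_sample))
print("their nominees:", sensors, "mean degree:", mean_degree(sensors))
```

Note that `pick_sensors` itself never touches the network table; it only calls the nomination function once per sampled person, which is the point of the technique.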

And we tested this idea with an outbreak of H1N1 flu at Harvard College in the fall and winter of 2009, just a few months ago. We took 1,300 randomly selected undergraduates, we had them nominate their friends, and we followed both the random students and their friends daily in time to see whether or not they had the flu. And we did this passively by looking at whether or not they'd gone to university health services. And also, we had them [actively] email us a couple of times a week. Exactly what we predicted happened. So the random group is in the red line. The epidemic in the friends group has shifted to the left, over here. And the difference in the two is 16 days. By monitoring the friends group, we could get 16 days' advance warning of an impending epidemic in this human population.

Now, in addition to that, if you were an analyst who was trying to study an epidemic or to predict the adoption of a product, for example, what you could do is pick a random sample of the population, have them nominate their friends, and follow both the randoms and the friends. Among the friends, the first evidence you saw of a blip above zero in adoption of the innovation, for example, would be evidence of an impending epidemic. Or you could see the first time the two curves diverged, as shown on the left. When did the randoms -- when did the friends take off and leave the randoms, and [when did] their curve start shifting? And that, as indicated by the white line, occurred 46 days before the peak of the epidemic. So this would be a technique whereby we could get more than a month and a half of warning about a flu epidemic in a particular population.
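
The two signals mentioned here, the shift between the curves and the first day they diverge, can be read off a pair of cumulative curves in a few lines. The sketch below runs on synthetic curves, not on the study's data: the logistic shape, the 10-day shift, the 0.5 level, and the 0.01 divergence gap are all arbitrary assumptions, and every function name is made up for illustration.

```python
import math


def logistic(day, midpoint, steepness=0.15):
    """Synthetic cumulative-adoption curve (fraction affected by `day`)."""
    return 1.0 / (1.0 + math.exp(-steepness * (day - midpoint)))


# Synthetic data only: the friends' curve is shifted 10 days earlier by construction.
days = range(120)
random_curve = [logistic(d, midpoint=70) for d in days]
friends_curve = [logistic(d, midpoint=60) for d in days]


def first_day_reaching(curve, level):
    """First day on which the cumulative curve is at or above `level`."""
    for day, value in enumerate(curve):
        if value >= level:
            return day
    return None


def lead_time(early_curve, late_curve, level=0.5):
    """Days by which the early curve reaches `level` before the late curve."""
    return first_day_reaching(late_curve, level) - first_day_reaching(early_curve, level)


def first_divergence(early_curve, late_curve, gap=0.01):
    """First day on which the early curve leads the late one by at least `gap`."""
    for day, (early, late) in enumerate(zip(early_curve, late_curve)):
        if early - late >= gap:
            return day
    return None


# Peak of the epidemic in the random group: the day with the most new cases.
daily_new = [b - a for a, b in zip(random_curve, random_curve[1:])]
peak_day = daily_new.index(max(daily_new)) + 1

diverged_on = first_divergence(friends_curve, random_curve)
print("lead time at the 50% level:", lead_time(friends_curve, random_curve), "days")
print("curves first diverge on day", diverged_on)
print("warning before the peak:", peak_day - diverged_on, "days")
```

With real surveillance data the curves would be noisy, so the threshold for "diverged" would need more care than the fixed gap assumed here.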

I should say that how much advance notice one might get about something depends on a host of factors. It could depend on the nature of the pathogen -- with different pathogens, using this technique, you'd get different warning -- or on other phenomena that are spreading, or, frankly, on the structure of the human network. Now in our case, although it wasn't necessary, we could also actually map the network of the students.

So, this is a map of 714 students and their friendship ties. And in a minute now, I'm going to put this map into motion. We're going to take daily cuts through the network for 120 days. The red dots are going to be cases of the flu, and the yellow dots are going to be friends of the people with the flu. And the size of the dots is going to be proportional to how many of their friends have the flu. So bigger dots mean more of your friends have the flu. And if you look at this image -- here we are now on September the 13th -- you're going to see a few cases light up. You're going to see a kind of blooming of the flu in the middle. Here we are on October the 19th. The slope of the epidemic curve is approaching now, in November. Bang, bang, bang, bang, bang -- you're going to see lots of blooming in the middle, and then you're going to see a sort of leveling off, fewer and fewer cases towards the end of December. And this type of visualization can show that epidemics like this take root and affect central individuals first, before they affect others.

Now, as I've been suggesting, this method is not restricted to germs, but actually applies to anything that spreads in populations. Information spreads in populations, norms can spread in populations, behaviors can spread in populations. And by behaviors, I can mean things like criminal behavior, or voting behavior, or health care behavior, like smoking, or vaccination, or product adoption, or other kinds of behaviors that relate to interpersonal influence. If I'm likely to do something that affects others around me, this technique can get early warning or early detection about the adoption within the population. The key thing is that for it to work, there has to be interpersonal influence. It cannot be because of some broadcast mechanism affecting everyone uniformly.

Now the same insights with respect to networks can also be exploited in other ways, for example, in targeting specific people for interventions. So, for example, most of you are probably familiar with the notion of herd immunity. So, if we have a population of a thousand people, and we want to make the population immune to a pathogen, we don't have to immunize every single person. If we immunize 960 of them, it's as if we had immunized a hundred [percent] of them. Because even if one or two of the non-immune people get infected, there's no one for them to infect. They are surrounded by immunized people. So 96 percent is as good as 100 percent. Well, some other scientists have estimated what would happen if you took a 30 percent random sample of these 1000 people, 300 people, and immunized them. Would you get any population-level immunity? And the answer is no. But if you took this 30 percent, these 300 people, and had them nominate their friends and took the same number of vaccine doses and vaccinated the friends of the 300 -- the 300 friends -- you can get the same level of herd immunity as if you had vaccinated 96 percent of the population, at a much greater efficiency, with a strict budget constraint.
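
The comparison between vaccinating 300 random people and vaccinating 300 nominated friends can be explored with a small simulation. Everything in this sketch is an assumption made for illustration: the synthetic network (built by preferential attachment), the seed, and the use of the largest connected cluster of unvaccinated people as a rough proxy for how far an outbreak could travel. It also keeps asking for nominations until 300 distinct friends are found, which is a simplification of the procedure described in the talk. It is not the estimate the other scientists produced.

```python
import random


def synthetic_network(size, rng):
    """Hypothetical network: each newcomer befriends up to two existing people,
    chosen with probability proportional to how many friends they already have."""
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    endpoints = [0, 1, 1, 2, 0, 2]  # each person listed once per friendship
    for newcomer in range(3, size):
        chosen = {rng.choice(endpoints), rng.choice(endpoints)}
        graph[newcomer] = set(chosen)
        for friend in chosen:
            graph[friend].add(newcomer)
            endpoints += [newcomer, friend]
    return graph


def vaccinate_random(graph, doses, rng):
    """Give the doses to randomly chosen people."""
    return set(rng.sample(sorted(graph), doses))


def vaccinate_nominated_friends(graph, doses, rng):
    """Give the doses to friends nominated by randomly chosen people."""
    people = sorted(graph)
    vaccinated = set()
    while len(vaccinated) < doses:
        person = rng.choice(people)
        vaccinated.add(rng.choice(sorted(graph[person])))
    return vaccinated


def largest_unprotected_cluster(graph, vaccinated):
    """Size of the biggest connected group of unvaccinated people."""
    seen = set(vaccinated)
    largest = 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        largest = max(largest, size)
    return largest


rng = random.Random(42)
network = synthetic_network(1000, rng)
doses = 300

random_group = vaccinate_random(network, doses, rng)
friend_group = vaccinate_nominated_friends(network, doses, rng)

print("largest unprotected cluster, random doses:",
      largest_unprotected_cluster(network, random_group))
print("largest unprotected cluster, friend doses:",
      largest_unprotected_cluster(network, friend_group))
```

On networks with a few very well-connected people, the friend-targeted doses would typically be expected to leave smaller unprotected clusters, but the exact numbers depend entirely on the assumed network and seed.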

And similar ideas can be used, for instance, to target distribution of things like bed nets in the developing world. If we could understand the structure of networks in villages, we could target to whom to give the interventions to foster these kinds of spreads. Or, frankly, for advertising with all kinds of products. If we could understand how to target, it could affect the efficiency of what we're trying to achieve. And in fact, we can use data from all kinds of sources nowadays [to do this].

This is a map of eight million phone users in a European country. Every dot is a person, and every line represents a volume of calls between the people. And we can use such data, that's being passively obtained, to map these whole countries and understand who is located where within the network. Without actually having to query them at all, we can get this kind of a structural insight. And other sources of information, as you're no doubt aware, are available about such features, from email interactions, online interactions, online social networks and so forth. And in fact, we are in the era of what I would call "massive-passive" data collection efforts. There are all kinds of ways we can use massively collected data to create sensor networks to follow the population, understand what's happening in the population, and intervene in the population for the better. Because these new technologies tell us not just who is talking to whom, but where everyone is, and what they're thinking based on what they're uploading on the Internet, and what they're buying based on their purchases. And all this administrative data can be pulled together and processed to understand human behavior in a way we never could before.

Isto é um mapa de oito milhões de utilizadores de telefone num país europeu. Cada ponto é uma pessoa e cada linha representa o volume de chamadas entre as pessoas. E podemos utilizar esta informação, que se está a obter passivamente, para mapear todos estes países e perceber quem está localizado onde dentro da rede. Sem, na realidade, termos de os questionar a todos, podemos obter uma espécie de visão estrutural. E outras fontes de informação, como sem dúvida sabem, estão disponíveis acerca dessas características, desde interacções por email, interacções online, redes sociais online e por aí adiante. E de facto, estamos numa era do que chamaria esforços de recolha de informação "massivo-passivos". Há todo o tipo de formas de utilizar informação recolhida massivamente para criar redes sensoras para seguir a população, perceber o que está a acontecer na população, e intervir na população para melhorar. Porque estas novas tecnologias dizem-nos não apenas quem fala com quem, mas onde está toda a gente, e o que estão a pensar baseado no que colocam na internet, e o que estão a comprar baseado nas suas aquisições. E toda esta informação administrativa pode ser colocada junta e processada para perceber o comportamento humano de uma forma que nunca antes conseguimos.
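
As a rough illustration of how call-volume data could show who sits where in a network, here is a minimal sketch in Python. It is not the method used to build the map in the talk. The record format `(caller, callee, n_calls)`, the names, and the choice of summed call volume as the measure of centrality are all assumptions made for this example.

```python
from collections import defaultdict

def call_strength(call_records):
    """Sum call volume per person over undirected (caller, callee, n_calls) records."""
    strength = defaultdict(int)
    for a, b, n_calls in call_records:
        # Each call counts toward both ends of the line between two people.
        strength[a] += n_calls
        strength[b] += n_calls
    return dict(strength)

def most_central(call_records, k=1):
    """Return the k people with the highest total call volume."""
    strength = call_strength(call_records)
    # Sort by descending volume, then by name so ties are deterministic.
    ranked = sorted(strength.items(), key=lambda kv: (-kv[1], kv[0]))
    return [person for person, _ in ranked[:k]]

# Hypothetical, made-up records: (person, person, number of calls).
records = [("ana", "bo", 5), ("ana", "cy", 3), ("bo", "cy", 1), ("cy", "di", 2)]
central = most_central(records, k=2)
```

Summed call volume is only one possible notion of being "central"; other measures, such as how many shortest paths pass through a person, could rank people differently.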

So, for example, we could use truckers' purchases of fuel. So the truckers are just going about their business, and they're buying fuel. And we see a blip up in the truckers' purchases of fuel, and we know that a recession is about to end. Or we can monitor the velocity with which people are moving with their phones on a highway, and the phone company can see, as the velocity is slowing down, that there's a traffic jam. And they can feed that information back to their subscribers, but only to their subscribers on the same highway located behind the traffic jam! Or we can monitor doctors' prescribing behaviors, passively, and see how the diffusion of innovation with pharmaceuticals occurs within [networks of] doctors. Or again, we can monitor purchasing behavior in people and watch how these types of phenomena can diffuse within human populations.

Por exemplo, podíamos utilizar as compras de combustível de camionistas. Os camionistas andam no seu negócio, e compram combustível. E vemos um movimento positivo nas compras de combustível dos camionistas e sabemos que a recessão está a acabar. Ou podemos monitorizar a velocidade com que as pessoas se movem com os seus telemóveis na auto-estrada, e a empresa de telemóveis pode ver, quando a velocidade abranda, que existe um engarrafamento. E podem fazer chegar essa informação aos seus clientes, mas apenas aos clientes na mesma auto-estrada localizados antes do engarrafamento! Ou podemos monitorizar os comportamentos de prescrição dos médicos, passivamente, e ver como a difusão de inovação com fármacos ocorre entre [a rede de] médicos. Ou mais uma vez, podíamos monitorizar o comportamento de compras nas pessoas, e verificar como estes tipos de fenómenos se podem difundir entre a população humana.

And there are three ways, I think, that these massive-passive data can be used. One is fully passive, like I just described -- as in, for instance, the trucker example, where we don't actually intervene in the population in any way. One is quasi-active, like the flu example I gave, where we get some people to nominate their friends and then passively monitor their friends -- do they have the flu, or not? -- and then get warning. Or another example would be, if you're a phone company, you figure out who's central in the network and you ask those people, "Look, will you just text us your fever every day? Just text us your temperature." And collect vast amounts of information about people's temperature, but from centrally located individuals. And be able, on a large scale, to monitor an impending epidemic with very minimal input from people. Or, finally, it can be more fully active -- as I know subsequent speakers will also talk about today -- where people might globally participate in wikis, or photographing, or monitoring elections, and upload information in a way that allows us to pool information in order to understand social processes and social phenomena.

E existem três formas, penso eu, em que esta informação massiva-passiva pode ser utilizada. Uma é completamente passiva, como acabei de descrever - como, por exemplo, no caso do camionista, onde não intervimos na população de nenhuma forma. Uma é quasi-activa, como o exemplo da gripe que dei, onde pedimos a algumas pessoas que nomeiem os seus amigos e depois, passivamente, monitorizamos os seus amigos - têm gripe ou não? - e então obtemos o alerta. Ou outro exemplo podia ser, se são uma empresa telefónica, perceberem quem é central na rede, e pedir a essas pessoas, "Olhe, podia-nos enviar uma mensagem com a sua febre todos os dias? Envie-nos apenas uma mensagem com a sua temperatura." E recolher grandes quantidades de informação acerca da temperatura das pessoas, mas de indivíduos localizados centralmente. E ser capaz, em grande escala, de monitorizar uma epidemia iminente com muito pouca colaboração das pessoas. Ou, finalmente, pode ser completamente activa - como sei que os oradores seguintes irão falar também hoje - onde as pessoas podem globalmente participar em wikis, ou fotografando ou monitorizando eleições, e carregar informação de forma a permitir-nos reunir informação de forma a percebermos processos sociais e fenómenos sociais.
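
The quasi-active approach, where some people nominate their friends and the friends are then monitored, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's actual procedure: the toy friendship graph, the fixed starting sample, and the helper names `degree`, `nominate_friends` and `mean` are all hypothetical.

```python
import random

def degree(graph):
    """Number of friends each person has."""
    return {node: len(friends) for node, friends in graph.items()}

def nominate_friends(graph, sample, rng):
    """Each sampled person names one of their friends uniformly at random."""
    return [rng.choice(sorted(graph[p])) for p in sample if graph[p]]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Hypothetical toy network: "hub" knows everyone, the others know few people.
graph = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub", "d"},
    "d": {"hub", "c"},
}

rng = random.Random(0)          # seeded so the run is repeatable
sample = ["a", "b", "c", "d"]   # a fixed illustrative starting sample
friends = nominate_friends(graph, sample, rng)

deg = degree(graph)
sample_mean = mean(deg[p] for p in sample)    # connectedness of the starting sample
friend_mean = mean(deg[f] for f in friends)   # connectedness of the nominated friends
```

In this toy graph the nominated friends are better connected on average than the people who named them, which is why watching the friends might give earlier warning. Real networks are far messier, so this shows only the idea, not the size of the effect.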

In fact, the availability of these data, I think, heralds a kind of new era of what I and others would like to call "computational social science." It's sort of like when Galileo invented -- or, didn't invent -- came to use a telescope and could see the heavens in a new way, or Leeuwenhoek became aware of the microscope -- or actually invented -- and could see biology in a new way. But now we have access to these kinds of data that allow us to understand social processes and social phenomena in an entirely new way that was never before possible. And with this science, we can understand how exactly the whole comes to be greater than the sum of its parts. And actually, we can use these insights to improve society and improve human well-being.

De facto, a disponibilidade desta informação, penso eu, abre uma espécie de nova era do que eu e outros gostaríamos de chamar "ciência social computacional". É como quando Galileu inventou - ou, não inventou - passou a usar um telescópio e pôde ver os céus duma nova forma, ou Leeuwenhoek tomou conhecimento do microscópio - ou na realidade inventou-o - e pôde ver a biologia duma nova maneira. Mas agora temos acesso a estas formas de informação que nos permitem perceber os processos sociais e os fenómenos sociais duma forma inteiramente nova que nunca tinha sido possível. E com esta ciência, podemos perceber exactamente como o todo se torna maior do que a soma das suas partes. E na realidade, podemos utilizar estes conhecimentos para melhorar a sociedade e melhorar o bem-estar humano.

Thank you.

Obrigado.

Related talks

Nicholas Christakis: The hidden influence of social networks

Dan Dennett: Dangerous memes

Laurie Garrett: Lessons from the 1918 flu

Gary Slutkin: Let's treat violence like a contagious disease

Andreas Raptopoulos: No roads? There's a drone for that

Eric Berlow and Sean Gourley: Mapping ideas worth spreading
