Nicholas Christakis: How social networks predict epidemics

For the last 10 years, I've been spending my time trying to figure out how and why human beings assemble themselves into social networks. And the kind of social network I'm talking about is not the recent online variety, but rather, the kind of social networks that human beings have been assembling for hundreds of thousands of years, ever since we emerged from the African savannah. So, I form friendships and co-worker and sibling and relative relationships with other people who in turn have similar relationships with other people. And this spreads on out endlessly into a distance. And you get a network that looks like this. Every dot is a person. Every line between them is a relationship between two people -- different kinds of relationships. And you can get this kind of vast fabric of humanity, in which we're all embedded.

Nos últimos 10 anos, eu tenho investido o meu tempo tentando descobrir como e por que os seres humanos se reúnem em redes sociais. E o tipo de rede social sobre a qual eu estou falando não é a recente variedade online, pelo contrário, é sobre o tipo de redes sociais em que os seres humanos têm se reunido por centenas de milhares de anos, desde que emergimos da savana africana. Então, eu tenho relações de amizade, de colega de trabalho, de irmão e relações de família com outras pessoas que, por sua vez, têm relações similares com outras pessoas. E isso se espalha em uma distância infinita. E você tem uma rede parecida com essa. Cada ponto é uma pessoa. Cada linha entre eles é uma relação entre duas pessoas -- diferentes tipos de relações. E você pode obter esse vasto tipo de rede da humanidade, em que nós todos estamos envolvidos.

And my colleague, James Fowler, and I have been studying for quite some time what are the mathematical, social, biological and psychological rules that govern how these networks are assembled and what are the similar rules that govern how they operate, how they affect our lives. But recently, we've been wondering whether it might be possible to take advantage of this insight, to actually find ways to improve the world, to do something better, to actually fix things, not just understand things. So one of the first things we thought we would tackle would be how we go about predicting epidemics.

O meu colega, James Fowler, e eu temos estudado há bastante tempo quais são as regras matemáticas, sociais, biológicas e psicológicas que governam como essas redes são organizadas e quais são as regras similares que governam como elas operam, como elas afetam as nossas vidas. E recentemente, nós temos questionado se seria possível tirar vantagem desse entendimento, para realmente encontrar formas de melhorar o mundo, para fazer algo melhor, para, na verdade, corrigir as coisas, não apenas entendê-las. Então uma das primeiras coisas que nós pensamos que atacaríamos seria sobre como prever epidemias.

And the current state of the art in predicting an epidemic -- if you're the CDC or some other national body -- is to sit in the middle where you are and collect data from physicians and laboratories in the field that report the prevalence or the incidence of certain conditions. So, so and so patients have been diagnosed with something, or other patients have been diagnosed, and all these data are fed into a central repository, with some delay. And if everything goes smoothly, one to two weeks from now you'll know where the epidemic was today. And actually, about a year or so ago, there was this promulgation of the idea of Google Flu Trends, with respect to the flu, where by looking at people's searching behavior today, we could know where the flu -- what the status of the epidemic was today, what's the prevalence of the epidemic today.

E o atual estado da arte em predizer uma epidemia -- se você é o CDC (centro de controle de doenças) ou algum outro órgão nacional -- é sentar no meio de onde você está e coletar dados de médicos e laboratórios da área que relatam a prevalência ou a incidência de certas condições. Tal, tal e tal pacientes têm sido diagnosticados com alguma coisa [por aqui], ou outros pacientes têm sido diagnosticados [ali], e todos esses dados alimentam um repositório central com algum atraso. E se tudo correr bem, em uma ou duas semanas, você saberá onde a epidemia estava hoje. Na verdade, cerca de um ano ou mais atrás, houve esse tipo de propagação dessa noção de Tendências de Gripe no Google, com relação à gripe, onde por pesquisar o comportamento de busca das pessoas hoje, nós poderíamos saber aonde a gripe ... qual o status da epidemia hoje, qual é a prevalência da epidemia hoje.

But what I'd like to show you today is a means by which we might get not just rapid warning about an epidemic, but also actually early detection of an epidemic. And, in fact, this idea can be used not just to predict epidemics of germs, but also to predict epidemics of all sorts of kinds. For example, anything that spreads by a form of social contagion could be understood in this way, from abstract ideas on the left like patriotism, or altruism, or religion to practices like dieting behavior, or book purchasing, or drinking, or bicycle-helmet [and] other safety practices, or products that people might buy, purchases of electronic goods, anything in which there's kind of an interpersonal spread. A kind of a diffusion of innovation could be understood and predicted by the mechanism I'm going to show you now.

Mas o que eu gostaria de mostrar hoje é um meio pelo qual nós podemos chegar não apenas a um alerta rápido de uma epidemia, mas também à detecção precoce de uma epidemia. E de fato, essa ideia pode ser usada não apenas para predizer epidemias de germes, mas também para predizer epidemias de todos os tipos. Por exemplo, qualquer coisa que se espalha por uma forma de contágio social poderia ser entendida dessa forma, desde idéias abstratas, à esquerda, como patriotismo, ou altruísmo, ou religião, até práticas como comportamento alimentar ou compras de livros ou beber ou uso de capacete de bicicleta e outras práticas de segurança, ou produtos que pessoas podem comprar, compra de produtos eletrônicos, qualquer coisa em que há um tipo de propagação interpessoal. Um tipo de difusão de inovação poderia ser entendida e prevista pelo mecanismo que eu vou mostrar agora.

So, as all of you probably know, the classic way of thinking about this is the diffusion-of-innovation, or the adoption curve. So here on the Y-axis, we have the percent of the people affected, and on the X-axis, we have time. And at the very beginning, not too many people are affected, and you get this classic sigmoidal, or S-shaped, curve. And the reason for this shape is that at the very beginning, let's say one or two people are infected, or affected by the thing and then they affect, or infect, two people, who in turn affect four, eight, 16 and so forth, and you get the epidemic growth phase of the curve. And eventually, you saturate the population. There are fewer and fewer people who are still available that you might infect, and then you get the plateau of the curve, and you get this classic sigmoidal curve. And this holds for germs, ideas, product adoption, behaviors, and the like. But things don't just diffuse in human populations at random. They actually diffuse through networks. Because, as I said, we live our lives in networks, and these networks have a particular kind of a structure.

Como todos vocês provavelmente sabem, a clássica forma de pensar sobre isso é a difusão da inovação ou a "curva de adoção." Aqui no eixo Y, nós temos o percentual de pessoas afetadas, e no eixo X, nós temos o tempo. E bem no começo, não muitas pessoas são afetadas, e você tem essa clássica sigmoide, ou curva em forma de S. E a razão para essa forma é que bem no começo, vamos dizer uma ou duas pessoas são afetadas, ou infectadas pela coisa, e então elas afetam ou infectam duas pessoas que, por sua vez, afetam quatro, oito, 16 e assim por diante, e você tem a fase de crescimento epidêmico da curva. E finalmente, você satura a população. Há menos e menos pessoas que ainda estão disponíveis para você infectar, e então você chega ao platô da curva, e você tem essa clássica curva sigmoide. E isso vale para os germes, idéias, adoção de produtos, comportamentos e assim por diante. Mas as coisas não se difundem em populações humanas aleatoriamente. Elas, na verdade, se difundem através de redes. Porque, como eu disse, nós vivemos nossas vidas em redes, e essas redes têm um tipo particular de estrutura.
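
The doubling-then-saturating logic described above can be sketched in a few lines. This is only an illustrative toy model, not anything shown in the talk: the function name `adoption_curve` and the parameters `population`, `initial`, `rate` and `steps` are assumptions made for the example. Each affected person affects `rate` new people per step, discounted by the share of the population that is already affected.

```python
def adoption_curve(population=1000, initial=1, rate=1.0, steps=20):
    """Toy discrete logistic growth (an assumed model, for illustration only).

    Early on almost everyone is still unaffected, so the count roughly
    doubles each step when rate=1.0; later, fewer people remain available
    and the curve flattens into a plateau.
    """
    affected = float(initial)
    curve = [affected]
    for _ in range(steps):
        still_available = 1 - affected / population
        new_cases = rate * affected * still_available
        affected = min(population, affected + new_cases)
        curve.append(affected)
    return curve


if __name__ == "__main__":
    for step, value in enumerate(adoption_curve()):
        # Crude text plot: one '#' per 20 affected people.
        print(f"{step:2d} {'#' * int(value // 20)}")
```

Printing the curve step by step shows the slow start, the steep growth phase and the plateau that give the S shape.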

Now if you look at a network like this -- this is 105 people. And the lines represent -- the dots are the people, and the lines represent friendship relationships. You might see that people occupy different locations within the network. And there are different kinds of relationships between the people. You could have friendship relationships, sibling relationships, spousal relationships, co-worker relationships, neighbor relationships and the like. And different sorts of things spread across different sorts of ties. For instance, sexually transmitted diseases will spread across sexual ties. Or, for instance, people's smoking behavior might be influenced by their friends. Or their altruistic or their charitable giving behavior might be influenced by their coworkers, or by their neighbors. But not all positions in the network are the same.

Agora se você olhar para uma rede como essa ... Aqui há 105 pessoas. E as linhas representam ... os pontos são as pessoas, e as linhas representam as relações de amizade. Você pode ver que as pessoas ocupam diferentes localizações dentro da rede. E há diferentes tipos de relações entre as pessoas. Você poderia ter relações de amizade, relações de irmãos, relações conjugais, relações de trabalho, relações de vizinhos e assim por diante. E os diferentes tipos de coisas se espalham através de diferentes tipos de laços. Por exemplo, doenças sexualmente transmissíveis se espalharão através de laços sexuais. Ou por exemplo, o comportamento de fumar pode ser influenciado por amigos. Ou seus comportamentos altruístas ou de caridade podem ser influenciados pelos seus colegas de trabalho, ou por seus vizinhos. Mas nem todas as posições na rede são as mesmas.

So if you look at this, you might immediately grasp that different people have different numbers of connections. Some people have one connection, some have two, some have six, some have 10 connections. And this is called the "degree" of a node, or the number of connections that a node has. But in addition, there's something else. So, if you look at nodes A and B, they both have six connections. But if you can see this image [of the network] from a bird's eye view, you can appreciate that there's something very different about nodes A and B. So, let me ask you this -- I can cultivate this intuition by asking a question -- who would you rather be if a deadly germ was spreading through the network, A or B? (Audience: B.) Nicholas Christakis: B, it's obvious. B is located on the edge of the network. Now, who would you rather be if a juicy piece of gossip were spreading through the network? A. And you have an immediate appreciation that A is going to be more likely to get the thing that's spreading and to get it sooner by virtue of their structural location within the network. A, in fact, is more central, and this can be formalized mathematically. So, if we want to track something that was spreading through a network, what we ideally would like to do is to set up sensors on the central individuals within the network, including node A, monitor those people that are right there in the middle of the network, and somehow get an early detection of whatever it is that is spreading through the network.

Se você olhar para isso, você pode imediatamente compreender que diferentes pessoas têm diferentes números de conexões. Algumas pessoas têm uma conexão, algumas têm duas, algumas têm seis, algumas têm 10 conexões. E isso é chamado de "grau" de um nó, ou o número de conexões que um nó tem. Mas além disso, há outra coisa. Então, se você olha para os nós A e B, ambos têm seis conexões. Mas se você olhar para essa imagem [da rede] do ponto de vista de um pássaro, você pode notar que há alguma coisa muito diferente sobre os nós A e B. Deixa eu perguntar a vocês isso -- eu posso cultivar essa intuição fazendo uma questão -- quem vocês gostariam de ser se um germe mortal estivesse se espalhando através da rede, A ou B? (Platéia: B) Nicholas Christakis: B, é óbvio. B está localizado na borda da rede. Agora, quem vocês gostariam de ser se uma fofoca suculenta estivesse se espalhando através da rede? A. E você tem uma apreciação imediata que A é mais provável de pegar uma coisa que está se espalhando e pegá-la mais cedo em função de sua localização estrutural na rede. A, de fato, é mais central, e isso pode ser formalizado matematicamente. Se nós quisermos monitorar alguma coisa que estivesse se espalhando através de uma rede, o que nós idealmente gostaríamos de fazer é configurar os sensores sobre os indivíduos centrais dentro da rede incluindo o nó A, monitorar aquelas pessoas que estão bem no meio da rede, e de alguma forma conseguir uma detecção precoce de tudo o que estiver se espalhando pela rede.
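
One common way to formalize "more central" is closeness centrality: the inverse of a node's average distance to everyone else. The talk does not say which measure was used, so treat this as one possible formalization. The network below is a made-up toy graph (`EDGES` is hypothetical, not the 105-person network on the slide) in which A and B have the same degree but sit in very different places.

```python
from collections import deque

# Hypothetical toy network: A and B both have three ties,
# but A sits in the middle and B sits near the edge.
EDGES = [
    ("A", "C"), ("A", "D"), ("A", "E"),
    ("C", "F"), ("D", "G"), ("E", "B"),
    ("B", "X"), ("B", "Y"),
]


def build_graph(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree(graph, node):
    """Number of connections a node has."""
    return len(graph[node])


def closeness(graph, source):
    """(reachable nodes) / (sum of shortest-path distances), via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


if __name__ == "__main__":
    g = build_graph(EDGES)
    for node in ("A", "B"):
        print(node, "degree:", degree(g, node),
              "closeness:", round(closeness(g, node), 3))
```

Degree alone cannot tell A from B here; closeness can, because whatever is spreading has fewer steps to travel, on average, before it reaches A.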

So if you saw them contract a germ or a piece of information, you would know that, soon enough, everybody was about to contract this germ or this piece of information. And this would be much better than monitoring six randomly chosen people, without reference to the structure of the population. And in fact, if you could do that, what you would see is something like this. On the left-hand panel, again, we have the S-shaped curve of adoption. In the dotted red line, we show what the adoption would be in the random people, and in the left-hand line, shifted to the left, we show what the adoption would be in the central individuals within the network. On the Y-axis is the cumulative instances of contagion, and on the X-axis is the time. And on the right-hand side, we show the same data, but here with daily incidence. And what we show here is -- like, here -- very few people are affected, more and more and more and up to here, and here's the peak of the epidemic. But shifted to the left is what's occurring in the central individuals. And this difference in time between the two is the early detection, the early warning we can get, about an impending epidemic in the human population.

Se vocês vissem eles contraírem um germe ou alguma informação, vocês saberiam que, breve o suficiente, todos estariam próximos de contrair esse germe ou essa informação. E isso seria muito melhor que monitorar seis pessoas aleatoriamente, sem referência à estrutura da população. E de fato, se você pudesse fazer isso, o que você veria é alguma coisa como isso, no painel do lado esquerdo, de novo, temos a curva de adoção em forma de S. Na linha vermelha pontilhada, nós mostramos o que seria a adoção em pessoas randômicas, e na linha esquerda deslocada para a esquerda, nós mostramos o que a adoção seria nos indivíduos centrais à rede. No eixo Y, estão os casos acumulados de contágio, e no eixo X está o tempo. E no lado direito, nós mostramos os mesmos dados, mas aqui com incidências diárias. E o que nós mostramos aqui é -- como, aqui -- muito poucas pessoas são afetadas, mais e mais e mais até aqui, e aqui está o pico da epidemia. Mas deslocada para a esquerda nos indivíduos centrais. E essa diferença em tempo entre os dois é a detecção precoce, o alarme precoce que nós podemos obter acerca de uma epidemia iminente na população humana.

The problem, however, is that mapping human social networks is not always possible. It can be expensive, not feasible, unethical, or, frankly, just not possible to do such a thing. So, how can we figure out who the central people are in a network without actually mapping the network? What we came up with was an idea to exploit an old fact, or a known fact, about social networks, which goes like this: Do you know that your friends have more friends than you do? Your friends have more friends than you do, and this is known as the friendship paradox. Imagine a very popular person in the social network -- like a party host who has hundreds of friends -- and a misanthrope who has just one friend, and you pick someone at random from the population; they are much more likely to know the party host. And if they nominate the party host as their friend, that party host has a hundred friends, therefore, has more friends than they do. And this, in essence, is what's known as the friendship paradox. The friends of randomly chosen people have higher degree, and are more central than the random people themselves.

O problema, porém, é que mapear redes sociais humanas nem sempre é possível. Isso pode ser caro, [muito difícil], antiético, ou, francamente, simplesmente poderia não ser possível fazer uma coisa dessas. Então, como podemos descobrir quem são as pessoas centrais na rede sem, na verdade, mapear a rede? O que surgiu foi uma idéia para explorar um fato antigo, ou um conhecido fato sobre redes sociais, que é o seguinte: você sabe que seus amigos têm mais amigos do que você? Seus amigos têm mais amigos do que você. Esse é o conhecido paradoxo da amizade. Imagine uma pessoa muito popular na rede social -- como o anfitrião de uma festa com centenas de amigos -- e um misantropo que tem apenas um amigo, e você pega alguém aleatoriamente da população; é muito mais provável que eles conheçam o anfitrião da festa. E se eles mencionarem o anfitrião da festa como um amigo, esse anfitrião tem uma centena de amigos, portanto, tem mais amigos que eles. E isso, em essência, é o que se conhece por paradoxo da amizade. Os amigos de uma pessoa randomicamente escolhida têm maior grau e maior centralidade que as próprias pessoas randomicamente escolhidas.
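
The friendship paradox can be checked directly on a small example. The sketch below builds a hypothetical party network (a host tied to every guest, guests tied to their two neighbors in a ring, and a misanthrope with a single friend); the names `party_network`, `mean_degree` and `mean_friend_degree` are invented for this illustration. It then compares the average degree of a randomly chosen person with the average degree of the friend that person would nominate at random, computed exactly rather than by sampling.

```python
def party_network(guests=20):
    """Hypothetical network: one host, a ring of guests, one misanthrope."""
    edges = []
    for i in range(guests):
        edges.append(("host", f"guest{i}"))
        edges.append((f"guest{i}", f"guest{(i + 1) % guests}"))
    edges.append(("misanthrope", "guest0"))
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def mean_degree(graph):
    """Average number of friends of a randomly chosen person."""
    return sum(len(friends) for friends in graph.values()) / len(graph)


def mean_friend_degree(graph):
    """Expected degree of a random friend nominated by a random person.

    For each person, average the degrees of their friends; then average
    that figure over everyone who has at least one friend.
    """
    per_person = [
        sum(len(graph[friend]) for friend in friends) / len(friends)
        for friends in graph.values()
        if friends
    ]
    return sum(per_person) / len(per_person)


if __name__ == "__main__":
    g = party_network()
    print("average friends per person:      ", round(mean_degree(g), 2))
    print("average friends per nominated friend:", round(mean_friend_degree(g), 2))
```

The nominated friends come out better connected than the people who nominated them, because popular people appear on many friend lists and so are nominated disproportionately often.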

And you can get an intuitive appreciation for this if you imagine just the people at the perimeter of the network. If you pick this person, the only friend they have to nominate is this person, who, by construction, must have at least two and typically more friends. And that happens at every peripheral node. And in fact, it happens throughout the network as you move in, everyone you pick, when they nominate a random -- when a random person nominates a friend of theirs, you move closer to the center of the network. So, we thought we would exploit this idea in order to study whether we could predict phenomena within networks. Because now, with this idea we can take a random sample of people, have them nominate their friends, those friends would be more central, and we could do this without having to map the network.

E você pode obter uma apreciação intuitiva a partir disso se você imagina apenas as pessoas no perímetro da rede. Se você pega essa pessoa, o único amigo que eles têm a mencionar é essa pessoa, que, pela estrutura, deve ter pelo menos dois e tipicamente mais amigos. E isso acontece com todos os nós periféricos. E de fato, isso acontece através da rede quando você se move para dentro, cada uma que você pega, quando eles mencionarem um randômico ... quando uma pessoa randômica mencionar um amigo seu, você se move para mais perto do centro da rede. Então, nós pensamos que nós exploraríamos essa idéia a fim de estudar se nós poderíamos prever fenômenos dentro das redes. Em função disso, com essa idéia, nós podemos pegar uma amostra randômica de pessoas, que indiquem os seus amigos, esses amigos seriam mais centrais, e nós poderíamos fazer isso sem ter mapeado a rede.

And we tested this idea with an outbreak of H1N1 flu at Harvard College in the fall and winter of 2009, just a few months ago. We took 1,300 randomly selected undergraduates, we had them nominate their friends, and we followed both the random students and their friends daily in time to see whether or not they had the flu epidemic. And we did this passively by looking at whether or not they'd gone to university health services. And also, we had them [actively] email us a couple of times a week. Exactly what we predicted happened. So the random group is in the red line. The epidemic in the friends group has shifted to the left, over here. And the difference in the two is 16 days. By monitoring the friends group, we could get 16 days advance warning of an impending epidemic in this human population.

E nós testamos essa idéia com um surto de gripe H1N1 na faculdade de Harvard no outono e no inverno de 2009, apenas alguns meses atrás. Nós pegamos 1300 estudantes de graduação aleatoriamente selecionados, que indicaram seus amigos, e nós seguimos ambos os estudantes aleatoriamente escolhidos e seus amigos diariamente em tempo de ver se eles tinham ou não a gripe epidêmica. E nós fizemos isso passivamente por olhar se eles procuravam ou não os serviços de saúde da universidade. E também nós enviamos e-mails [ativamente] a eles algumas vezes por semana. Exatamente o que nós prevíamos aconteceu. O grupo randômico está na linha vermelha. A epidemia no grupo de amigos está deslocada para a esquerda, aqui. E a diferença entre os dois é de 16 dias. Monitorando o grupo de amigos, nós poderíamos alertar com 16 dias de antecedência sobre uma iminente epidemia nessa população humana.
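
The talk does not describe how the 16-day shift was estimated, so the following is only one simple way to measure a lead between two cumulative curves: find the day on which each group reaches a given fraction of its final count and take the difference. The function names and the daily counts are invented toy values, not data from the Harvard study.

```python
def day_reaching(cumulative, fraction=0.5):
    """First day on which a cumulative series reaches `fraction` of its final value."""
    target = fraction * cumulative[-1]
    for day, value in enumerate(cumulative):
        if value >= target:
            return day
    return len(cumulative) - 1


def lead_time(friends, randoms, fraction=0.5):
    """Days by which the friends curve runs ahead of the random curve."""
    return day_reaching(randoms, fraction) - day_reaching(friends, fraction)


if __name__ == "__main__":
    # Made-up cumulative case counts per day, for illustration only.
    friends_group = [0, 1, 3, 6, 8, 9, 10, 10, 10, 10]
    random_group = [0, 0, 0, 1, 3, 6, 8, 9, 10, 10]
    print("lead in days:", lead_time(friends_group, random_group))
```

A positive result means the friends group reached the threshold first, which is the early warning described above.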

Now, in addition to that, if you were an analyst who was trying to study an epidemic or to predict the adoption of a product, for example, what you could do is you could pick a random sample of the population, also have them nominate their friends and follow the friends and follow both the randoms and the friends. Among the friends, the first evidence you saw of a blip above zero in adoption of the innovation, for example, would be evidence of an impending epidemic. Or you could see the first time the two curves diverged, as shown on the left. When did the randoms -- when did the friends take off and leave the randoms, and [when did] their curve start shifting? And that, as indicated by the white line, occurred 46 days before the peak of the epidemic. So this would be a technique whereby we could get more than a month-and-a-half warning about a flu epidemic in a particular population.

Agora, além disso, se você fosse um analista que estivesse tentando estudar uma epidemia ou prever a adoção de um produto, por exemplo, o que você poderia fazer é pegar uma amostra randômica da população, que também nomeasse os seus amigos e seguir esses amigos, e seguir ambos os randômicos e amigos. Entre os amigos, a primeira evidência de um pontinho acima de zero na adoção da inovação, por exemplo, seria evidência de uma epidemia iminente. Ou você poderia observar o primeiro momento em que as duas curvas divergem, como mostrado na esquerda. Quando os randômicos... quando os amigos se deslocaram e deixaram os randômicos, e sua curva começou a mudar? Isso, como indicado pela linha branca, ocorreu 46 dias antes do pico da epidemia. Então essa seria uma técnica pela qual nós poderíamos obter um alerta de mais de um mês e meio sobre a epidemia de gripe em uma população particular.

I should say that how far advanced a notice one might get about something depends on a host of factors. It could depend on the nature of the pathogen -- different pathogens, using this technique, you'd get different warning -- or other phenomena that are spreading, or frankly, on the structure of the human network. Now in our case, although it wasn't necessary, we could also actually map the network of the students.

E deveria dizer que o quão avançado um aviso pode ser sobre alguma coisa depende de uma série de fatores. Isso poderia depender da natureza do patógeno -- diferentes patógenos, usando essa técnica, você obteria diferentes alertas -- ou outro fenômeno que está se espalhando, ou, francamente, da estrutura da rede humana. Agora, no nosso caso, embora não tenha sido necessário, nós poderíamos também mapear a rede de estudantes.

So, this is a map of 714 students and their friendship ties. And in a minute now, I'm going to put this map into motion. We're going to take daily cuts through the network for 120 days. The red dots are going to be cases of the flu, and the yellow dots are going to be friends of the people with the flu. And the size of the dots is going to be proportional to how many of their friends have the flu. So bigger dots mean more of your friends have the flu. And if you look at this image -- here we are now in September the 13th -- you're going to see a few cases light up. You're going to see kind of blooming of the flu in the middle. Here we are on October the 19th. The slope of the epidemic curve is approaching now, in November. Bang, bang, bang, bang, bang -- you're going to see lots of blooming in the middle, and then you're going to see a sort of leveling off, fewer and fewer cases towards the end of December. And this type of a visualization can show that epidemics like this take root and affect central individuals first, before they affect others.

Então, esse é um mapa de 714 estudantes e seus laços de amizade. E em um minuto agora, eu vou colocar esse mapa em movimento. Nós pegaremos cortes diários através da rede por 120 dias. Os pontos vermelhos são os casos de gripe, e os amarelos são os amigos das pessoas com gripe. E o tamanho desses pontos é proporcional a quantos de seus amigos têm gripe. Quanto maiores os pontos, mais de seus amigos têm gripe. E se você olhar para essa imagem -- aqui nós estamos em 13 de setembro -- você verá alguns casos surgirem. Você verá uma espécie de florescência de gripe no meio. Aqui nós estamos em 19 de outubro. A inclinação da curva epidêmica está se aproximando agora, em novembro. Bang, bang, bang, bang, bang, você verá um monte de florescências no meio, e você verá uma espécie de nivelamento, cada vez menos casos no fim de dezembro. E esse tipo de visualização mostra que epidemias como essa se enraizam e afetam indivíduos centrais primeiro, antes de afetar os outros indivíduos.

Now, as I've been suggesting, this method is not restricted to germs, but actually to anything that spreads in populations. Information spreads in populations, norms can spread in populations, behaviors can spread in populations. And by behaviors, I can mean things like criminal behavior, or voting behavior, or health care behavior, like smoking, or vaccination, or product adoption, or other kinds of behaviors that relate to interpersonal influence. If I'm likely to do something that affects others around me, this technique can get early warning or early detection about the adoption within the population. The key thing is that for it to work, there has to be interpersonal influence. It cannot be because of some broadcast mechanism affecting everyone uniformly.

Agora, como eu tenho sugerido, esse método não é restrito aos germes, mas, na verdade, serve para qualquer coisa que se espalha nas populações. Informação se espalha em populações. Normas podem se espalhar em populações. Comportamentos podem se espalhar em populações. E por comportamento, eu quero dizer algo como comportamento criminoso, ou comportamento de voto, ou comportamento de cuidado com a saúde, como fumar, ou vacinação, ou adoção de produtos, ou outros tipos de comportamentos relacionados à influência interpessoal. Se eu vou provavelmente fazer algo que afeta os outros em minha volta, essa técnica pode obter um alerta precoce, ou detecção precoce, sobre a adoção dentro da população. O aspecto chave é que, para isso funcionar, deve haver influência interpessoal. Não pode acontecer em função de um mecanismo de difusão afetando todos uniformemente.

Now the same insights can also be exploited -- with respect to networks -- can also be exploited in other ways, for example, in the use of targeting specific people for interventions. So, for example, most of you are probably familiar with the notion of herd immunity. So, if we have a population of a thousand people, and we want to make the population immune to a pathogen, we don't have to immunize every single person. If we immunize 960 of them, it's as if we had immunized a hundred [percent] of them. Because even if one or two of the non-immune people gets infected, there's no one for them to infect. They are surrounded by immunized people. So 96 percent is as good as 100 percent. Well, some other scientists have estimated what would happen if you took a 30 percent random sample of these 1000 people, 300 people and immunized them. Would you get any population-level immunity? And the answer is no. But if you took this 30 percent, these 300 people and had them nominate their friends and took the same number of vaccine doses and vaccinated the friends of the 300 -- the 300 friends -- you can get the same level of herd immunity as if you had vaccinated 96 percent of the population at a much greater efficiency, with a strict budget constraint.

Os mesmos insights podem também ser explorados -- com respeito a redes -- podem também ser explorados de outras formas, por exemplo, no uso de alvos, pessoas específicas, para intervenções. Por exemplo, a maioria de vocês está provavelmente familiarizada com a noção de imunidade de rebanho. Se nós temos uma população de mil pessoas, e nós queremos torná-la imune a um patógeno, nós não temos que imunizar cada uma das pessoas. Se nós imunizarmos 960 delas, será como se nós tivéssemos imunizado cem por cento delas. Porque mesmo se uma ou duas pessoas não imunizadas forem infectadas, não há ninguém para elas infectarem. Elas estão cercadas por pessoas imunizadas. Então 96 por cento é tão bom quanto 100 por cento. Bem, alguns outros cientistas têm estimado o que aconteceria se você pegasse uma amostra randômica de 30 por cento dessas 1000 pessoas, 300 pessoas e as imunizasse. Você obteria uma imunização em nível de população? E a resposta é não. Mas se você pegasse esses 30 por cento, essas 300 pessoas, e elas indicassem os seus amigos, e você pegasse o mesmo número de doses de vacina e vacinasse os amigos dos 300, os 300 amigos, você teria o mesmo nível de imunidade de rebanho como se você tivesse vacinado 96 por cento da população, com muito mais eficiência, com uma restrição orçamentária rígida.

And similar ideas can be used, for instance, to target distribution of things like bed nets in the developing world. If we could understand the structure of networks in villages, we could target to whom to give the interventions to foster these kinds of spreads. Or, frankly, for advertising with all kinds of products. If we could understand how to target, it could affect the efficiency of what we're trying to achieve. And in fact, we can use data from all kinds of sources nowadays [to do this].

E idéias semelhantes podem ser usadas, por exemplo, para direcionar a distribuição de coisas como mosquiteiros no mundo em desenvolvimento. Se nós pudéssemos entender a estrutura das redes nas aldeias, nós poderíamos escolher a quem dar essas intervenções para promover esse tipo de propagação. Ou, francamente, para publicidade de todo o tipo de produtos. Se nós pudéssemos entender como atingir o alvo, isso poderia afetar a eficiência do que nós estamos tentando atingir. E de fato, nós podemos usar dados de todos os tipos de fontes hoje em dia [para fazer isso].

This is a map of eight million phone users in a European country. Every dot is a person, and every line represents a volume of calls between the people. And we can use such data, that's being passively obtained, to map these whole countries and understand who is located where within the network. Without actually having to query them at all, we can get this kind of a structural insight. And other sources of information, as you're no doubt aware, are available about such features, from email interactions, online interactions, online social networks and so forth. And in fact, we are in the era of what I would call "massive-passive" data collection efforts. There are all kinds of ways we can use massively collected data to create sensor networks to follow the population, understand what's happening in the population, and intervene in the population for the better. Because these new technologies tell us not just who is talking to whom, but where everyone is, and what they're thinking based on what they're uploading on the Internet, and what they're buying based on their purchases. And all this administrative data can be pulled together and processed to understand human behavior in a way we never could before.

Esse é um mapa de oito milhões de usuários de telefone em um país europeu. Cada ponto é uma pessoa e cada linha representa um volume de chamadas entre as pessoas. E nós podemos usar dados como esses, que são obtidos passivamente, para mapear países inteiros e entender quem está localizado onde dentro da rede. Sem, na verdade, ter de consultar ninguém, nós podemos obter esse tipo de compreensão estrutural. E outras fontes de informação sobre essas características, como vocês sem dúvida sabem, estão disponíveis: interações por e-mail, interações online, redes sociais online e assim por diante. E, de fato, nós estamos na era do que eu chamaria de esforços "massivos-passivos" de coleta de dados. Há todo tipo de formas pelas quais nós podemos usar dados coletados massivamente para criar redes de sensores para acompanhar a população, entender o que está acontecendo na população e intervir na população para melhor. Porque essas novas tecnologias nos dizem não apenas quem está falando com quem, mas onde cada um está, o que as pessoas estão pensando, com base no que elas estão enviando para a internet, e o que elas estão comprando, com base nas suas compras. E todos esses dados administrativos podem ser reunidos e processados para entender o comportamento humano de uma forma que nós nunca conseguimos antes.

So, for example, we could use truckers' purchases of fuel. So the truckers are just going about their business, and they're buying fuel. And we see a blip up in the truckers' purchases of fuel, and we know that a recession is about to end. Or we can monitor the velocity with which people are moving with their phones on a highway, and the phone company can see, as the velocity is slowing down, that there's a traffic jam. And they can feed that information back to their subscribers, but only to their subscribers on the same highway located behind the traffic jam! Or we can monitor doctors' prescribing behaviors, passively, and see how the diffusion of innovation with pharmaceuticals occurs within [networks of] doctors. Or again, we can monitor purchasing behavior in people and watch how these types of phenomena can diffuse within human populations.

Por exemplo, nós poderíamos usar as compras de combustível dos caminhoneiros. Os caminhoneiros estão apenas cuidando dos seus negócios, e eles estão comprando combustível. E nós vemos um salto nas compras de combustível dos caminhoneiros, e nós sabemos que a recessão está próxima do fim. Ou nós podemos monitorar a velocidade com que as pessoas estão se movimentando com seus telefones em uma rodovia, e a companhia telefônica pode ver, quando a velocidade está diminuindo, que há um congestionamento. E ela pode repassar essa informação aos seus assinantes, mas apenas aos assinantes que estão na mesma rodovia, localizados atrás do congestionamento! Ou nós podemos monitorar o comportamento de prescrição dos médicos, passivamente, e ver como a difusão de inovações farmacêuticas ocorre dentro de [redes de] médicos. Ou, de novo, nós podemos monitorar o comportamento de compra das pessoas e observar como esses tipos de fenômenos podem se difundir dentro das populações humanas.
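
As a rough illustration of the traffic-jam example above, the following Python sketch detects a slow stretch of highway from passively collected phone positions and picks out only the subscribers on the same highway who are behind it. Everything in it is an assumption made for illustration: the `Ping` record, the one-kilometer segments, the 20 km/h threshold and the 10 km look-back distance are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Ping:
    """One passively observed phone reading (hypothetical record layout)."""
    subscriber: str
    highway: str
    km: float         # position along the highway, increasing in the travel direction
    speed_kmh: float


def find_jam(pings, highway, segment_km=1.0, jam_speed_kmh=20.0):
    """Return the km marker where the first slow segment starts, or None."""
    speeds_by_segment = {}
    for p in pings:
        if p.highway == highway:
            index = int(p.km // segment_km)
            speeds_by_segment.setdefault(index, []).append(p.speed_kmh)
    # A segment counts as jammed when its average speed is below the threshold.
    slow = [i for i, speeds in speeds_by_segment.items()
            if mean(speeds) < jam_speed_kmh]
    return min(slow) * segment_km if slow else None


def subscribers_to_warn(pings, highway, jam_km, lookback_km=10.0):
    """Subscribers on the same highway who have not yet reached the jam."""
    return sorted({p.subscriber for p in pings
                   if p.highway == highway
                   and jam_km - lookback_km <= p.km < jam_km})


if __name__ == "__main__":
    readings = [
        Ping("a", "A1", 2.0, 110), Ping("b", "A1", 8.5, 95),
        Ping("c", "A1", 12.2, 10), Ping("d", "A1", 12.7, 15),
        Ping("e", "A1", 20.0, 100), Ping("f", "B2", 9.0, 100),
    ]
    jam = find_jam(readings, "A1")
    if jam is not None:
        print(jam, subscribers_to_warn(readings, "A1", jam))
```

The point of the sketch is that nobody is asked anything: the jam and the list of people worth warning both fall out of data the phone company already holds.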

And there are three ways, I think, that these massive-passive data can be used. One is fully passive, like I just described -- as in, for instance, the trucker example, where we don't actually intervene in the population in any way. One is quasi-active, like the flu example I gave, where we get some people to nominate their friends and then passively monitor their friends -- do they have the flu, or not? -- and then get warning. Or another example would be, if you're a phone company, you figure out who's central in the network and you ask those people, "Look, will you just text us your fever every day? Just text us your temperature." And collect vast amounts of information about people's temperature, but from centrally located individuals. And be able, on a large scale, to monitor an impending epidemic with very minimal input from people. Or, finally, it can be more fully active -- as I know subsequent speakers will also talk about today -- where people might globally participate in wikis, or photographing, or monitoring elections, and upload information in a way that allows us to pool information in order to understand social processes and social phenomena.

E há três formas, eu acho, pelas quais esses dados massivos-passivos podem ser usados. Uma é inteiramente passiva, como eu acabei de descrever -- como, por exemplo, no caso do caminhoneiro, em que nós não intervimos na população de nenhuma forma. Uma é quase ativa, como o exemplo da gripe que eu dei, em que nós pedimos a algumas pessoas para indicarem os seus amigos e então monitoramos passivamente os seus amigos -- eles têm gripe ou não? -- e então recebemos o alerta. Ou outro exemplo seria, se você é uma companhia telefônica, você descobre quem é central na rede e pede a essas pessoas: "Olhe, você pode nos enviar por mensagem de texto sua febre todos os dias? Apenas nos envie sua temperatura." E coleta grandes quantidades de informação sobre a temperatura das pessoas, mas de indivíduos centralmente localizados. E é capaz, em larga escala, de monitorar uma epidemia iminente com uma contribuição mínima das pessoas. Ou, finalmente, pode ser mais inteiramente ativa -- como eu sei que os próximos palestrantes também falarão hoje -- em que as pessoas podem participar globalmente em wikis, ou fotografando, ou monitorando eleições, e enviar informação de uma forma que nos permita reuni-la para entender os processos sociais e os fenômenos sociais.
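
The quasi-active approach above, asking some people to nominate their friends and then monitoring the friends, can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not the method used in the studies the talk refers to: the network is synthetic (newcomers attach preferentially to well-connected people), and the function names `build_toy_network`, `nominate_friends` and `mean_degree` are made up for this example. It shows only why nominated friends tend to be more central than the people who nominated them.

```python
import random


def build_toy_network(n_people, ties_per_newcomer=2, seed=0):
    """Synthetic friendship network (hypothetical data, not real measurements).

    Each newcomer befriends existing people chosen in proportion to how many
    ties they already hold, which produces a few highly connected people.
    """
    rng = random.Random(seed)
    friends = {0: {1}, 1: {0}}
    endpoints = [0, 1]  # each person appears once per tie they hold
    for newcomer in range(2, n_people):
        wanted = min(ties_per_newcomer, newcomer)
        chosen = set()
        while len(chosen) < wanted:
            chosen.add(rng.choice(endpoints))
        friends[newcomer] = set()
        for other in chosen:
            friends[newcomer].add(other)
            friends[other].add(newcomer)
            endpoints.extend([newcomer, other])
    return friends


def nominate_friends(friends, people, rng):
    """Each person in the random group names one of their friends at random."""
    return [rng.choice(sorted(friends[p])) for p in people]


def mean_degree(friends, people):
    """Average number of ties held by the given people."""
    return sum(len(friends[p]) for p in people) / len(people)


if __name__ == "__main__":
    network = build_toy_network(2000)
    rng = random.Random(1)
    random_group = rng.sample(sorted(network), 500)
    friend_group = nominate_friends(network, random_group, rng)
    print("random group, mean ties:", mean_degree(network, random_group))
    print("friend group, mean ties:", mean_degree(network, friend_group))
```

Because well-connected people are named more often, the friend group sits closer to the center of the network without anyone having to map the whole network first, which is what makes it usable as an early-warning sensor.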

In fact, the availability of these data, I think, heralds a kind of new era of what I and others would like to call "computational social science." It's sort of like when Galileo invented -- or, didn't invent -- came to use a telescope and could see the heavens in a new way, or Leeuwenhoek became aware of the microscope -- or actually invented -- and could see biology in a new way. But now we have access to these kinds of data that allow us to understand social processes and social phenomena in an entirely new way that was never before possible. And with this science, we can understand how exactly the whole comes to be greater than the sum of its parts. And actually, we can use these insights to improve society and improve human well-being.

De fato, a disponibilidade desses dados, eu acho, anuncia um tipo de nova era do que eu e outros gostaríamos de chamar de "ciência social computacional". É como quando Galileu inventou -- ou melhor, não inventou -- passou a usar o telescópio e pôde ver o céu de uma nova maneira, ou quando Leeuwenhoek tomou conhecimento do microscópio -- ou, na verdade, o inventou -- e pôde ver a biologia de uma nova maneira. Mas agora nós temos acesso a esse tipo de dados, que nos permitem entender os processos sociais e os fenômenos sociais de uma forma inteiramente nova, que nunca antes foi possível. E com essa ciência, nós podemos entender como, exatamente, o todo passa a ser maior que a soma de suas partes. E, na verdade, nós podemos usar esses insights para melhorar a sociedade e o bem-estar humano.

Thank you.

Obrigado.
