So you go to the doctor and get some tests. The doctor determines that you have high cholesterol and you would benefit from medication to treat it. So you get a pillbox. You have some confidence, your physician has some confidence that this is going to work. The company that invented it did a lot of studies, submitted it to the FDA. They studied it very carefully, skeptically, they approved it. They have a rough idea of how it works, they have a rough idea of what the side effects are. It should be OK. You have a little more of a conversation with your physician and the physician is a little worried because you've been blue, haven't felt like yourself, you haven't been able to enjoy things in life quite as much as you usually do. Your physician says, "You know, I think you have some depression. I'm going to have to give you another pill."
Vas ao médico e fas análises. O médico diche que tes o colesterol alto e que é mellor que te poñas en tratamento. Recéitache unhas pílulas. Tes confianza, o teu médico confía en que funcionará. A compañía que as creou fixo moitas análises, enviounas á FDA. Estudounas con coidado, con escepticismo, aprobounas. Teñen unha vaga idea de como funcionan, teñen unha vaga idea dos efectos secundarios. Debería ir todo ben. Falas un pouco máis co teu médico, o médico está preocupado porque estiveches deprimido, notábaste distinto, non gozabas das cousas da vida tanto coma antes. O médico diche: "Creo que tes depresión. Vouche ter que dar outra pílula".
So now we're talking about two medications. This pill also -- millions of people have taken it, the company did studies, the FDA looked at it -- all good. Think things should go OK. Think things should go OK. Well, wait a minute. How much have we studied these two together?
Así que agora falamos de dous medicamentos. Esta pílula tamén... moita xente a tomou, a compañía fixo análises, a FDA revisouna... todo ben. Pensas que todo debería ir ben. Pensas que todo debería ir ben. Espera un momento. Cantos estudos se fixeron das dúas xuntas?
Well, it's very hard to do that. In fact, it's not traditionally done. We totally depend on what we call "post-marketing surveillance," after the drugs hit the market. How can we figure out if bad things are happening between two medications? Three? Five? Seven? Ask your favorite person who has several diagnoses how many medications they're on.
Iso é complicado de facer. De feito, o normal é que non se faga. Dependemos totalmente do que chamamos "vixilancia poscomercialización", cando as pílulas xa están no mercado. Como podemos saber se algo está indo mal entre dous medicamentos? Ou tres? Ou cinco? Ou sete? Pregúntalle canta medicación toma a alguén con varios diagnósticos.
Why do I care about this problem? I care about it deeply. I'm an informatics and data science guy and really, in my opinion, the only hope -- only hope -- to understand these interactions is to leverage lots of different sources of data in order to figure out when drugs can be used together safely and when it's not so safe.
Por que me preocupo por isto? Preocúpame moito. Son un home da ciencia dos datos e da informática e, na miña opinión, a única esperanza... a única... para entender estas interaccións é aproveitar as máximas fontes de información posibles para determinar cando é seguro usar xuntos os medicamentos e cando non é tan seguro.
So let me tell you a data science story. And it begins with my student Nick. Let's call him "Nick," because that's his name.
Cóntovos unha historia da ciencia dos datos. Empeza co meu alumno Nick. Ímoslle chamar "Nick", porque se chama así.
(Laughter)
(Risas)
Nick was a young student. I said, "You know, Nick, we have to understand how drugs work and how they work together and how they work separately, and we don't have a great understanding. But the FDA has made available an amazing database. It's a database of adverse events. They literally put on the web -- publicly available, you could all download it right now -- hundreds of thousands of adverse event reports from patients, doctors, companies, pharmacists. And these reports are pretty simple: it has all the diseases that the patient has, all the drugs that they're on, and all the adverse events, or side effects, that they experience. It is not all of the adverse events that are occurring in America today, but it's hundreds and hundreds of thousands of drugs."
Nick era un alumno novo. Eu díxenlle: "Temos que entender como funcionan os medicamentos e como funcionan xuntos e por separado, e non sabemos moito diso". Pero a FDA dispoñibilizou unha incrible base de datos. É unha base de datos de efectos adversos. Subiron a Internet... dispoñible para o público, calquera pode descargalos... centos de miles de informes sobre efectos adversos de pacientes, médicos, empresas, farmacéuticos. Son informes bastante sinxelos: están todas as enfermidades dos pacientes, os medicamentos que toman, e os efectos adversos ou secundarios que sofren. Non están todos os efectos adversos actuais dos Estados Unidos, pero hai centos e centos de miles de medicamentos.
So I said to Nick, "Let's think about glucose. Glucose is very important, and we know it's involved with diabetes. Let's see if we can understand glucose response." I sent Nick off. Nick came back.
Entón díxenlle a Nick: "Imos pensar na glicosa. A glicosa é moi importante e sabemos que ten que ver coa diabetes. A ver se entendemos a resposta á glicosa". Nick marchou para outro lado. Nick volveu.
"Russ," he said, "I've created a classifier that can look at the side effects of a drug based on looking at this database, and can tell you whether that drug is likely to change glucose or not."
"Russ" -dixo el- "Creei un clasificador que pode ver os efectos secundarios dun medicamento buscando nesta base de datos, e pode dicir se é probable que o medicamento altere a glicosa".
He did it. It was very simple, in a way. He took all the drugs that were known to change glucose and a bunch of drugs that don't change glucose, and said, "What's the difference in their side effects? Differences in fatigue? In appetite? In urination habits?" All those things conspired to give him a really good predictor. He said, "Russ, I can predict with 93 percent accuracy when a drug will change glucose."
Fixérao. En certo modo era moi simple. Colleu os medicamentos que se sabe que alteran a glicosa e un feixe de medicamentos que non a alteran, e preguntou: "Que diferenza hai entre os efectos secundarios? Hai diferenzas de fatiga? De apetito? Dos hábitos urinarios?" Todo isto conspirou para facer un bo método preditivo. Dixo: "Russ, podo predicir cun 93% de precisión cando vai cambiar a glicosa".
I said, "Nick, that's great." He's a young student, you have to build his confidence. "But Nick, there's a problem. It's that every physician in the world knows all the drugs that change glucose, because it's core to our practice. So it's great, good job, but not really that interesting, definitely not publishable."
Eu dixen: "Xenial, Nick". É un alumno novo, hai que reforzarlle a confianza. "Pero Nick, hai un problema. Todos os médicos do mundo saben que medicamentos cambian a glicosa, porque é algo básico na nosa práctica. Así que estupendo, bo traballo, pero non moi interesante realmente, definitivamente non publicable".
(Laughter)
(Risas)
He said, "I know, Russ. I thought you might say that." Nick is smart. "I thought you might say that, so I did one other experiment. I looked at people in this database who were on two drugs, and I looked for signals similar, glucose-changing signals, for people taking two drugs, where each drug alone did not change glucose, but together I saw a strong signal."
El dixo: "Xa sei. Pensei que dirías iso". Nick é listo. "Pensei que o dirías, por iso fixen outro experimento. Busquei na base de datos persoas que tomasen dous fármacos, e busquei sinais semellantes, sinais de alteración da glicosa, en xente que toma dous fármacos, cada un dos cales por si só non alterase a glicosa, pero xuntos presentasen un sinal forte".
And I said, "Oh! You're clever. Good idea. Show me the list." And there's a bunch of drugs, not very exciting. But what caught my eye was, on the list there were two drugs: paroxetine, or Paxil, an antidepressant; and pravastatin, or Pravachol, a cholesterol medication.
E eu dixen: "Que listo es! Boa idea. Ensíname a lista". E había medicamentos apenas interesantes, pero chamoume a atención que na lista había dous: paroxetina, ou Paxil, un antidepresivo, e pravastatina, ou Pravachol, un medicamento para o colesterol.
And I said, "Huh. There are millions of Americans on those two drugs." In fact, we learned later, 15 million Americans on paroxetine at the time, 15 million on pravastatin, and a million, we estimated, on both. So that's a million people who might be having some problems with their glucose if this machine-learning mumbo jumbo that he did in the FDA database actually holds up. But I said, "It's still not publishable, because I love what you did with the mumbo jumbo, with the machine learning, but it's not really standard-of-proof evidence that we have." So we have to do something else. Let's go into the Stanford electronic medical record. We have a copy of it that's OK for research, we removed identifying information. And I said, "Let's see if people on these two drugs have problems with their glucose."
E dixen: "Ah! Millóns de estadounidenses toman estes dous medicamentos". De feito, despois soubemos que 15 millóns toman paroxetina, 15 millóns pravastatina, e calculamos que un millón, as dúas. Entón un millón de persoas poderían estar tendo problemas de glicosa se este galimatías automático que fixo na base de datos da FDA se sostén realmente. Pero eu dixen: "Aínda non é publicable, encántame o que fixeches coa lea esta, coa aprendizaxe automática pero o que temos non é unha proba evidente". Temos que facer algo máis. Imos ao rexistro médico electrónico de Stanford. Temos unha copia que serve para investigar, quitámoslle a información identificativa. E dixen: "Imos ver se a xente que toma eses fármacos ten problemas de glicosa".
Now there are thousands and thousands of people in the Stanford medical records that take paroxetine and pravastatin. But we needed special patients. We needed patients who were on one of them and had a glucose measurement, then got the second one and had another glucose measurement, all within a reasonable period of time -- something like two months. And when we did that, we found 10 patients. However, eight out of the 10 had a bump in their glucose when they got the second P -- we call this P and P -- when they got the second P. Either one could be first, the second one comes up, glucose went up 20 milligrams per deciliter. Just as a reminder, you walk around normally, if you're not diabetic, with a glucose of around 90. And if it gets up to 120, 125, your doctor begins to think about a potential diagnosis of diabetes. So a 20 bump -- pretty significant.
Hai miles de persoas nos rexistros médicos de Stanford que toman paroxetina e pravastatina, pero necesitabamos pacientes especiais. Necesitabamos pacientes que tomasen un deles e medisen a glicosa, e despois tomasen o outro e medisen outra vez a glicosa, todo dentro dun tempo razoable... algo así como dous meses. E cando o fixemos encontramos 10 pacientes. Con todo, oito de cada dez tiveron aumento de glicosa cando tomaron o segundo P —chamámoslles P e P— cando tomaron o segundo P. Fose cal fose o primeiro, cando tomaban o segundo, a glicosa subía 20 miligramos por decilitro. Só para situarnos, normalmente andamos, se non somos diabéticos, coa glicosa arredor de 90. Se sobe ata 120, 125, o médico empeza a pensar nun posible diagnóstico de diabetes. Así que un aumento de 20... é bastante significativo.
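The patient selection described above (a glucose reading while on the first drug, another after the second drug is added, both inside a window of roughly two months) can be sketched in a few lines. This is only an illustration under stated assumptions: the `PatientRecord` layout, the `glucose_change` helper, the 60-day window and the sample values are all hypothetical, and the study's real inclusion criteria are not given in the talk.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class PatientRecord:
    """Hypothetical, de-identified record for one patient."""
    first_drug_start: date                 # start date of whichever P came first
    second_drug_start: date                # start date of the other P
    glucose: list[tuple[date, float]]      # (measurement date, mg/dL)


def glucose_change(rec: PatientRecord,
                   window: timedelta = timedelta(days=60)) -> Optional[float]:
    """Return the glucose change (mg/dL) across the start of the second drug.

    Uses the latest reading taken on the first drug alone and the earliest
    reading taken after the second drug was added. Returns None when either
    reading is missing or the two are further apart than `window`.
    """
    before = [(d, v) for d, v in rec.glucose
              if rec.first_drug_start <= d < rec.second_drug_start]
    after = [(d, v) for d, v in rec.glucose if d >= rec.second_drug_start]
    if not before or not after:
        return None
    d0, v0 = max(before)   # latest reading before the second drug
    d1, v1 = min(after)    # earliest reading after the second drug
    if d1 - d0 > window:
        return None
    return v1 - v0


# Invented example values, for illustration only.
eligible = PatientRecord(date(2010, 1, 1), date(2010, 2, 1),
                         [(date(2010, 1, 15), 92.0), (date(2010, 2, 20), 113.0)])
too_late = PatientRecord(date(2010, 1, 1), date(2010, 2, 1),
                         [(date(2010, 1, 15), 92.0), (date(2010, 6, 1), 113.0)])

print(glucose_change(eligible))   # 21.0
print(glucose_change(too_late))   # None
```

A filter this strict is why thousands of patients on both drugs can shrink to a handful of usable cases.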
I said, "Nick, this is very cool. But, I'm sorry, we still don't have a paper, because this is 10 patients and -- give me a break -- it's not enough patients."
Eu dixen: "Nick, está moi ben, pero, síntoo, aínda non temos artigo, porque son 10 pacientes e, por favor, non abondan".
So we said, what can we do? And we said, let's call our friends at Harvard and Vanderbilt, who also -- Harvard in Boston, Vanderbilt in Nashville, who also have electronic medical records similar to ours. Let's see if they can find similar patients with the one P, the other P, the glucose measurements in that range that we need.
Que podemos facer? Vamos chamar aos amigos de Harvard e Vanderbilt... Harvard en Boston, Vanderbilt en Nashville, que tamén teñen historias clínicas electrónicas parecidas. A ver se encontran pacientes parecidos cun P, o outro P, as medicións de glicosa no rango que necesitamos.
God bless them, Vanderbilt in one week found 40 such patients, same trend. Harvard found 100 patients, same trend. So at the end, we had 150 patients from three diverse medical centers that were telling us that patients getting these two drugs were having their glucose bump somewhat significantly.
Non podía crelo, Vanderbilt nunha semana encontrou 40 pacientes deses, coa mesma tendencia, e Harvard encontrou 100, coa mesma tendencia. Ao final, tiñamos 150 pacientes de tres centros médicos diferentes que nos dicían que os pacientes que tomaban eses dous medicamentos tiñan un aumento de glicosa considerable.
More interestingly, we had left out diabetics, because diabetics already have messed up glucose. When we looked at the glucose of diabetics, it was going up 60 milligrams per deciliter, not just 20. This was a big deal, and we said, "We've got to publish this." We submitted the paper. It was all data evidence, data from the FDA, data from Stanford, data from Vanderbilt, data from Harvard. We had not done a single real experiment.
Máis interesante aínda, deixaramos fóra os diabéticos, porque a diabetes xa afecta á glicosa. Cando nos fixamos na glicosa dos diabéticos, vimos que subía ata 60 miligramos por decilitro, non só 20. Isto era importante e dixemos: "Temos que publicalo". Enviamos o artigo. Todas as probas eran datos, datos da FDA, datos de Stanford, datos de Vanderbilt, de Harvard. Non fixeramos un só experimento real.
But we were nervous. So Nick, while the paper was in review, went to the lab. We found somebody who knew about lab stuff. I don't do that. I take care of patients, but I don't do pipettes. They taught us how to feed mice drugs. We took mice and we gave them one P, paroxetine. We gave some other mice pravastatin. And we gave a third group of mice both of them. And lo and behold, glucose went up 20 to 60 milligrams per deciliter in the mice.
Pero estabamos nerviosos. Así que Nick, mentres revisaban o artigo, foi ao laboratorio. Encontramos unha persoa que entendía de laboratorio. Eu non sei diso. Encárgome de pacientes, non traballo con pipetas. Ensináronnos a darlles os medicamentos a ratos. Collemos uns ratos e démoslles un P, paroxetina. A outros ratos démoslles pravastatina, e a un terceiro grupo démoslles os dous. Mira por onde, a glicosa aumentou de 20 a 60 miligramos por decilitro nos ratos.
So the paper was accepted based on the informatics evidence alone, but we added a little note at the end, saying, oh by the way, if you give these to mice, it goes up.
Aceptaron o artigo só coas probas informáticas, pero engadimos unha notiña ao final que poñía ah por certo, se se proba con ratos, aumenta.
That was great, and the story could have ended there. But I still have six and a half minutes.
Foi xenial e a historia podería acabar aquí, pero aínda teño seis minutos e medio.
(Laughter)
(Risas)
So we were sitting around thinking about all of this, and I don't remember who thought of it, but somebody said, "I wonder if patients who are taking these two drugs are noticing side effects of hyperglycemia. They could and they should. How would we ever determine that?"
Entón estabamos sen facer nada pensando en todo isto, e non recordo quen foi, pero alguén dixo: "Pregúntome se os pacientes que toman estes dous fármacos están notando efectos secundarios de hiperglicemia. Poderían e deberían. Como poderiamos determinar isto?"
We said, well, what do you do? You're taking a medication, one new medication or two, and you get a funny feeling. What do you do? You go to Google and type in the two drugs you're taking or the one drug you're taking, and you type in "side effects." What are you experiencing? So we said OK, let's ask Google if they will share their search logs with us, so that we can look at the search logs and see if patients are doing these kinds of searches. Google, I am sorry to say, denied our request. So I was bummed. I was at a dinner with a colleague who works at Microsoft Research and I said, "We wanted to do this study, Google said no, it's kind of a bummer." He said, "Well, we have the Bing searches."
Que é o que se fai? Estás tomando un medicamento novo ou dous e tes unha sensación rara. Que fas? Vas a Google e introduces o nome dos medicamentos que estás tomando e escribes "efectos secundarios". Que sentes? Entón dixemos: vale, imos pedirlle a Google que comparta os rexistros de buscas con nós, así poderemos revisalos e ver se os pacientes fan ese tipo de buscas. Sinto dicilo, pero Google rexeitou a petición. Quedei desanimado. Nunha cea cun colega que traballa na Microsoft Research conteillo: "Queriamos facer un estudo, Google dixo que non, vaia decepción". El dixo: "Temos as buscas de Bing".
(Laughter)
(Risas)
Yeah. That's great. Now I felt like I was --
Si. Estupendo. Sentinme coma se...
(Laughter)
(Risas)
I felt like I was talking to Nick again. He works for one of the largest companies in the world, and I'm already trying to make him feel better. But he said, "No, Russ -- you might not understand. We not only have Bing searches, but if you use Internet Explorer to do searches at Google, Yahoo, Bing, any ... Then, for 18 months, we keep that data for research purposes only." I said, "Now you're talking!" This was Eric Horvitz, my friend at Microsoft.
Sentinme coma se falase con Nick. Traballa para unha das empresas máis grandes do mundo, e eu estou intentando facer que se sinta ben. Pero el dixo: "Non, Russ... creo que non entendiches. Non só temos as buscas de Bing, se usas Internet Explorer para facer buscas en Google, Yahoo, Bing, calquera... durante 18 meses, gardamos os datos para usalos en investigación". Eu dixen: "Agora falaches!" O meu amigo en Microsoft era Eric Horvitz.
So we did a study where we defined 50 words that a regular person might type in if they're having hyperglycemia, like "fatigue," "loss of appetite," "urinating a lot," "peeing a lot" -- forgive me, but that's one of the things you might type in. So we had 50 phrases that we called the "diabetes words." And we did first a baseline. And it turns out that about 0.5 to 1 percent of all searches on the Internet involve one of those words. So that's our baseline rate. If people type in "paroxetine" or "Paxil" -- those are synonyms -- and one of those words, the rate goes up to about two percent of diabetes-type words, if you already know that there's that "paroxetine" word. If it's "pravastatin," the rate goes up to about three percent from the baseline. If both "paroxetine" and "pravastatin" are present in the query, it goes up to 10 percent, a huge three- to four-fold increase in those searches with the two drugs that we were interested in, and diabetes-type words or hyperglycemia-type words.
Así que fixemos un estudo no que definimos 50 palabras que unha persoa podería teclear se padecía hiperglicemia, como "fatiga", "perda de apetito", "ouriñar moito", "mexar moito"... perdón, pero é unha das cousas que se poderían escribir. A esas 50 frases chamámoslles "palabras de diabetes". Primeiro marcamos un punto de referencia. Resultou que, máis ou menos, do 0,5 ao 1 por cento de todas as buscas en Internet incluían unha desas palabras. Esa foi a nosa taxa de referencia. Se alguén teclea "paroxetina" ou "Paxil" -son sinónimos- e unha desas palabras, a taxa sobe ata un 2% das palabras de tipo diabetes, se xa sabemos que está a palabra "paroxetina". Se é "pravastatina", a taxa sobe a arredor dun 3% da referencia. Se na consulta aparecen "paroxetina" e "pravastatina", sobe ata o 10%, un grande aumento de tres a catro veces nas buscas cos dous medicamentos que nos interesaban e as palabras relacionadas con diabetes ou con hiperglicemia.
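The rates quoted above are conditional frequencies: among the queries that mention a given drug (or both drugs), what share also contains one of the symptom phrases? A rough sketch of that calculation follows. The phrase list holds only the four examples named in the talk, not the full set of 50, and the helper names and toy queries are invented for illustration; this is not the pipeline actually run on the search logs.

```python
from typing import Iterable, Sequence

# Only the four phrases mentioned in the talk; the full list of 50 is not given.
SYMPTOM_PHRASES = ("fatigue", "loss of appetite", "urinating a lot", "peeing a lot")

# Generic and brand names are treated as synonyms, as in the talk.
DRUG_SYNONYMS = {
    "paroxetine": ("paroxetine", "paxil"),
    "pravastatin": ("pravastatin", "pravachol"),
}


def mentions(query: str, terms: Iterable[str]) -> bool:
    """True if the lower-cased query contains any of the terms."""
    q = query.lower()
    return any(term in q for term in terms)


def symptom_rate(queries: Sequence[str], required_drugs: Sequence[str] = ()) -> float:
    """Share of queries naming every required drug that also contain a symptom phrase.

    With no required drugs this is the baseline rate over all queries.
    """
    selected = [q for q in queries
                if all(mentions(q, DRUG_SYNONYMS[d]) for d in required_drugs)]
    if not selected:
        return 0.0
    hits = sum(mentions(q, SYMPTOM_PHRASES) for q in selected)
    return hits / len(selected)


# Invented toy log, for illustration only.
toy_queries = [
    "paxil fatigue",
    "paxil dosage",
    "pravastatin and paroxetine peeing a lot",
    "pravachol price",
    "weather tomorrow",
    "paroxetine pravastatin side effects",
]

baseline = symptom_rate(toy_queries)
both = symptom_rate(toy_queries, ("paroxetine", "pravastatin"))
print(baseline, both)
```

On real logs, the comparison of interest is the rate for queries naming both drugs against the baseline and against each single-drug rate.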
We published this, and it got some attention. The reason it deserves attention is that patients are telling us their side effects indirectly through their searches. We brought this to the attention of the FDA. They were interested. They have set up social media surveillance programs to collaborate with Microsoft, which had a nice infrastructure for doing this, and others, to look at Twitter feeds, to look at Facebook feeds, to look at search logs, to try to see early signs that drugs, either individually or together, are causing problems.
Publicámolo, e conseguiu algo de atención. A razón pola que merece atención é que os pacientes estannos contando os efectos secundarios indirectamente a través das buscas. Chamamos a atención da FDA sobre isto. Interesoulles. Crearon programas de vixilancia dos medios sociais para colaborar con Microsoft, que tiña boa infraestrutura para facer isto, e outros, para observar os contidos do Twitter, do Facebook, os rexistros das buscas, para buscar sinais temperáns de que os medicamentos, por separado ou en conxunto, están causando problemas.
What do I take from this? Why tell this story? Well, first of all, we have now the promise of big data and medium-sized data to help us understand drug interactions and really, fundamentally, drug actions. How do drugs work? This will create and has created a new ecosystem for understanding how drugs work and to optimize their use. Nick went on; he's a professor at Columbia now. He did this in his PhD for hundreds of pairs of drugs. He found several very important interactions, and so we replicated this and we showed that this is a way that really works for finding drug-drug interactions.
Que saco disto? Por que conto esta historia? Primeiro, temos a promesa dos datos masivos ou de tamaño medio de axudarnos a entender as interaccións entre medicamentos e, fundamentalmente, as súas accións. Como funcionan os medicamentos? Isto creará e xa creou un novo ecosistema para entender como funcionan os medicamentos e optimizar o seu uso. Nick seguiu adiante; agora é profesor en Columbia. Fixo isto no doutoramento con centos de pares de medicamentos. Encontrou interaccións moi importantes, por iso o volvemos facer e demostramos que o método realmente funciona para encontrar interaccións entre medicamentos.
However, there's a couple of things. We don't just use pairs of drugs at a time. As I said before, there are patients on three, five, seven, nine drugs. Have they been studied with respect to their nine-way interaction? Yes, we can do pair-wise, A and B, A and C, A and D, but what about A, B, C, D, E, F, G all together, being taken by the same patient, perhaps interacting with each other in ways that either make them more effective or less effective or cause side effects that are unexpected? We really have no idea. It's a blue sky, open field for us to use data to try to understand the interaction of drugs.
Pero, hai un par de cousas. Non só usamos pares de medicamentos á vez. Como dixen, hai pacientes que toman tres, cinco, sete, nove medicamentos. Hai algún estudo relacionado coa interacción dos nove? Si, podemos comparar por pares, A e B, A e C, A e D, pero que pasa con A, B, C, D, E, F, G xuntos, cando os toma o mesmo paciente, quizais interactuando entre eles en modos que os fan máis eficaces ou menos ou que causan efectos secundarios inesperados? Realmente non temos nin idea. Para nós é un campo aberto o feito de utilizar datos para ver de entender a interacción dos medicamentos.
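To get a feel for why going beyond pairs is so hard, it helps to count how many distinct drug combinations one patient's medication list contains. The short sketch below does only that counting; the function name is made up for illustration and nothing here comes from the study itself.

```python
from math import comb


def interaction_subsets(n_drugs: int) -> dict[int, int]:
    """Number of distinct k-drug combinations (k >= 2) among n_drugs medications."""
    return {k: comb(n_drugs, k) for k in range(2, n_drugs + 1)}


counts = interaction_subsets(9)
print(counts[2])             # 36 pairs for a patient on nine drugs
print(sum(counts.values()))  # 502 combinations of two or more drugs
```

That is the count for a single patient's list; across all marketed drugs the number of possible combinations is far larger, which is why premarket trials cannot cover them.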
Two more lessons: I want you to think about the power that we were able to generate with the data from people who had volunteered their adverse reactions through their pharmacists, through themselves, through their doctors, the people who allowed the databases at Stanford, Harvard, Vanderbilt, to be used for research. People are worried about data. They're worried about their privacy and security -- they should be. We need secure systems. But we can't have a system that closes that data off, because it is too rich of a source of inspiration, innovation and discovery for new things in medicine.
Dúas leccións máis: Quero que pensen na forza que puidemos xerar cos datos da xente que aceptou compartir as súas reaccións adversas por medio dos farmacéuticos, entre eles mesmos, dos seus médicos, a xente que permitiu que as bases de datos de Stanford, Harvard, Vanderbilt, se usasen para investigar. Á xente preocúpana os datos. Preocúpaa a privacidade e a seguridade... teñen razón. Necesitamos sistemas seguros. Pero non podemos ter un sistema que impida acceder a eses datos, porque é unha fonte demasiado rica de inspiración, innovación e descubrimento de cousas novas en medicina.
And the final thing I want to say is, in this case we found two drugs and it was a little bit of a sad story. The two drugs actually caused problems. They increased glucose. They could throw somebody into diabetes who would otherwise not be in diabetes, and so you would want to use the two drugs very carefully together, perhaps not together, make different choices when you're prescribing. But there was another possibility. We could have found two drugs or three drugs that were interacting in a beneficial way. We could have found new effects of drugs that neither of them has alone, but together, instead of causing a side effect, they could be a new and novel treatment for diseases that don't have treatments or where the treatments are not effective. If we think about drug treatment today, all the major breakthroughs -- for HIV, for tuberculosis, for depression, for diabetes -- it's always a cocktail of drugs.
O último que quero dicir é: neste caso encontramos dous fármacos e foi unha historia algo triste. Os dous medicamentos causaban problemas. Aumentaban a glicosa. Podían provocarlle diabetes a alguén que doutro modo non a tería, por iso o desexable é usar os dous medicamentos xuntos con coidado, quizais nin xuntos, escoller outros á hora de receitar. Pero hai outra posibilidade. Poderiamos encontrar dous ou tres medicamentos que interactuasen de forma beneficiosa. Poderiamos encontrar efectos novos que ningún dos fármacos ten por separado, pero xuntos, en vez de causar efectos secundarios, poderían ser un tratamento novidoso para as enfermidades sen tratamentos ou con tratamentos pouco efectivos. Se pensamos nos tratamentos con medicamentos hoxe, todos os avances importantes... para o VIH, a tuberculose, a depresión, a diabetes... sempre son un cóctel de medicamentos.
And so the upside here, and the subject for a different TED Talk on a different day, is how can we use the same data sources to find good effects of drugs in combination that will provide us new treatments, new insights into how drugs work and enable us to take care of our patients even better?
O lado positivo aquí, e un tema para outra conferencia TED noutro día, é como podemos usar as mesmas fontes de datos para encontrar efectos positivos na combinación de medicamentos que nos proporcionen novos tratamentos, ideas de como funcionan os fármacos e nos permitan coidar dos pacientes incluso mellor?
Thank you very much.
Moitas grazas.
(Applause)
(Aplausos)