Nine years ago, my sister discovered lumps in her neck and arm and was diagnosed with cancer. From that day, she started to benefit from the understanding that science has of cancer. Every time she went to the doctor, they measured specific molecules that gave them information about how she was doing and what to do next. New medical options became available every few years. Everyone recognized that she was struggling heroically with a biological illness. This spring, she received an innovative new medical treatment in a clinical trial. It dramatically knocked back her cancer. Guess who I'm going to spend this Thanksgiving with? My vivacious sister, who gets more exercise than I do, and who, like perhaps many people in this room, increasingly talks about a lethal illness in the past tense. Science can, in our lifetimes -- even in a decade -- transform what it means to have a specific illness.
But not for all illnesses. My friend Robert and I were classmates in graduate school. Robert was smart, but with each passing month, his thinking seemed to become more disorganized. He dropped out of school, got a job in a store ... But that, too, became too complicated. Robert became fearful and withdrawn. A year and a half later, he started hearing voices and believing that people were following him. Doctors diagnosed him with schizophrenia, and they gave him the best drug they could. That drug makes the voices somewhat quieter, but it didn't restore his bright mind or his social connectedness. Robert struggled to remain connected to the worlds of school and work and friends. He drifted away, and today I don't know where to find him. If he watches this, I hope he'll find me.
Why does medicine have so much to offer my sister, and so much less to offer millions of people like Robert? The need is there. The World Health Organization estimates that brain illnesses like schizophrenia, bipolar disorder and major depression are the world's largest cause of lost years of life and work. That's in part because these illnesses often strike early in life, in many ways, in the prime of life, just as people are finishing their educations, starting careers, forming relationships and families. These illnesses can result in suicide; they often compromise one's ability to work at one's full potential; and they're the cause of so many tragedies harder to measure: lost relationships and connections, missed opportunities to pursue dreams and ideas. These illnesses limit human possibilities in ways we simply cannot measure.
We live in an era in which there's profound medical progress on so many other fronts. My sister's cancer story is a great example, and we could say the same of heart disease. Drugs like statins will prevent millions of heart attacks and strokes. When you look at these areas of profound medical progress in our lifetimes, they have a narrative in common: scientists discovered molecules that matter to an illness, they developed ways to detect and measure those molecules in the body, and they developed ways to interfere with those molecules using other molecules -- medicines. It's a strategy that has worked again and again and again. But when it comes to the brain, that strategy has been limited, because today, we don't know nearly enough, yet, about how the brain works. We need to learn which of our cells matter to each illness, and which molecules in those cells matter to each illness. And that's the mission I want to tell you about today.
My lab develops technologies with which we try to turn the brain into a big-data problem. You see, before I became a biologist, I worked in computers and math, and I learned this lesson: wherever you can collect vast amounts of the right kinds of data about the functioning of a system, you can use computers in powerful new ways to make sense of that system and learn how it works. Today, big-data approaches are transforming ever-larger sectors of our economy, and they could do the same in biology and medicine, too. But you have to have the right kinds of data. You have to have data about the right things. And that often requires new technologies and ideas. And that is the mission that animates the scientists in my lab.
Today, I want to tell you two short stories from our work. One fundamental obstacle we face in trying to turn the brain into a big-data problem is that our brains are composed of and built from billions of cells. And our cells are not generalists; they're specialists. Like humans at work, they specialize into thousands of different cellular careers, or cell types.
In fact, each of the cell types in our body could probably give a lively TED Talk about what it does at work. But as scientists, we don't even know today how many cell types there are, and we don't know what the titles of most of those talks would be. Now, we know many important things about cell types. They can differ dramatically in size and shape. One will respond to a molecule that another doesn't respond to, and they'll make different molecules. But science has largely been reaching these insights in an ad hoc way, one cell type at a time, one molecule at a time. We wanted to make it possible to learn all of this quickly and systematically.
Now, until recently, it was the case that if you wanted to inventory all of the molecules in a part of the brain or any organ, you had to first grind it up into a kind of cellular smoothie. But that's a problem. As soon as you've ground up the cells, you can only study the contents of the average cell -- not the individual cells. Imagine if you were trying to understand how a big city like New York works, but you could only do so by reviewing some statistics about the average resident of New York. Of course, you wouldn't learn very much, because everything that's interesting and important and exciting is in all the diversity and the specializations. And the same thing is true of our cells. And we wanted to make it possible to study the brain not as a cellular smoothie but as a cellular fruit salad, in which one could generate data about and learn from each individual piece of fruit.
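[Editor's illustration: the smoothie-versus-fruit-salad contrast above can be sketched with a toy example. The cell types, gene names and numbers below are invented for illustration and are not data from the talk. The sketch only shows that an average over a mixed sample can describe a profile that no individual cell actually has.]

```python
from statistics import mean

# Hypothetical toy data: expression of two made-up genes in six cells.
# Three cells use only GENE_A and three use only GENE_B.
GENES = ["GENE_A", "GENE_B"]
cells = [
    {"type": "neuron", "GENE_A": 10, "GENE_B": 0},
    {"type": "neuron", "GENE_A": 10, "GENE_B": 0},
    {"type": "neuron", "GENE_A": 10, "GENE_B": 0},
    {"type": "glia", "GENE_A": 0, "GENE_B": 10},
    {"type": "glia", "GENE_A": 0, "GENE_B": 10},
    {"type": "glia", "GENE_A": 0, "GENE_B": 10},
]

def bulk_profile(sample, genes):
    """Average each gene over every cell, as a ground-up sample would."""
    return {g: mean(c[g] for c in sample) for g in genes}

def per_type_profile(sample, genes):
    """Average each gene separately within each labelled cell type."""
    groups = {}
    for c in sample:
        groups.setdefault(c["type"], []).append(c)
    return {t: bulk_profile(members, genes) for t, members in groups.items()}

smoothie = bulk_profile(cells, GENES)
fruit_salad = per_type_profile(cells, GENES)

# Does any single cell look like the "average cell"?
matches_average = any(
    all(c[g] == smoothie[g] for g in GENES) for c in cells
)

print("bulk average:", smoothie)
print("per cell type:", fruit_salad)
print("any cell matches the average:", matches_average)
```

In this toy sample the bulk average reports moderate use of both genes, while every individual cell uses exactly one of them.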
So we developed a technology for doing that. You're about to see a movie of it. Here we're packaging tens of thousands of individual cells, each into its own tiny water droplet for its own molecular analysis. When a cell lands in a droplet, it's greeted by a tiny bead, and that bead delivers millions of DNA bar code molecules. And each bead delivers a different bar code sequence to a different cell. We incorporate the DNA bar codes into each cell's RNA molecules. Those are the molecular transcripts it's making of the specific genes that it's using to do its job. And then we sequence billions of these combined molecules and use the sequences to tell us which cell and which gene every molecule came from.
We call this approach "Drop-seq," because we use droplets to separate the cells for analysis, and we use DNA sequences to tag and inventory and keep track of everything. And now, whenever we do an experiment, we analyze tens of thousands of individual cells. And today in this area of science, the challenge is increasingly how to learn as much as we can as quickly as we can from these vast data sets.
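[Editor's illustration: the final bookkeeping step described above, using sequences to tell which cell and which gene each molecule came from, can be sketched as a simple grouping operation. This is a minimal sketch and not the lab's actual software. It assumes each sequenced molecule has already been reduced to a (bar code, gene) pair, and the four-letter bar codes, gene names and the function name `count_by_cell_and_gene` are hypothetical.]

```python
from collections import Counter, defaultdict

def count_by_cell_and_gene(reads):
    """Group (bar code, gene) pairs into a per-cell table of gene counts.

    Each distinct bar code is treated as one cell, because each bead
    delivers a different bar code sequence to a different cell.
    """
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        counts[barcode][gene] += 1
    # Convert to plain dicts so the result is easy to print and compare.
    return {barcode: dict(genes) for barcode, genes in counts.items()}

# Hypothetical reads: (cell bar code, gene the RNA molecule came from).
reads = [
    ("ACGT", "GENE_A"),
    ("ACGT", "GENE_A"),
    ("TTAG", "GENE_B"),
    ("ACGT", "GENE_B"),
    ("TTAG", "GENE_B"),
]

table = count_by_cell_and_gene(reads)
for barcode, genes in table.items():
    print(barcode, genes)
```

The result is one row per cell and one count per gene, which is the kind of table that downstream analysis of tens of thousands of cells would start from.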
When we were developing Drop-seq, people used to tell us, "Oh, this is going to make you guys the go-to for every major brain project." That's not how we saw it. Science is best when everyone is generating lots of exciting data. So we wrote a 25-page instruction book, with which any scientist could build their own Drop-seq system from scratch. And that instruction book has been downloaded from our lab website 50,000 times in the past two years. We wrote software that any scientist could use to analyze the data from Drop-seq experiments, and that software is also free, and it's been downloaded from our website 30,000 times in the past two years. And hundreds of labs have written us about discoveries that they've made using this approach. Today, this technology is being used to make a human cell atlas. It will be an atlas of all of the cell types in the human body and the specific genes that each cell type uses to do its job.
Now I want to tell you about a second challenge that we face in trying to turn the brain into a big-data problem. And that challenge is that we'd like to learn from the brains of hundreds of thousands of living people. But our brains are not physically accessible while we're living. So how can we discover molecular factors if we can't hold the molecules? An answer comes from the fact that the most informative molecules, proteins, are encoded in our DNA, which has the recipes our cells follow to make all of our proteins. And these recipes vary from person to person to person in ways that cause the proteins to vary from person to person in their precise sequence and in how much each cell type makes of each protein. It's all encoded in our DNA, and it's all genetics, but it's not the genetics that we learned about in school.
Do you remember big B, little b? If you inherit big B, you get brown eyes? It's simple. Very few traits are that simple. Even eye color is shaped by much more than a single pigment molecule. And something as complex as the function of our brains is shaped by the interaction of thousands of genes. And each of these genes varies meaningfully from person to person to person, and each of us is a unique combination of that variation. It's a big-data opportunity. And today, it's increasingly possible to make progress on a scale that was never possible before. People are contributing to genetic studies in record numbers, and scientists around the world are sharing the data with one another to speed progress.
I want to tell you a short story about a discovery we recently made about the genetics of schizophrenia. It was made possible by 50,000 people from 30 countries, who contributed their DNA to genetic research on schizophrenia. It had been known for several years that the human genome's largest influence on risk of schizophrenia comes from a part of the genome that encodes many of the molecules in our immune system. But it wasn't clear which gene was responsible. A scientist in my lab developed a new way to analyze DNA with computers, and he discovered something very surprising. He found that a gene called "complement component 4" -- it's called "C4" for short -- comes in dozens of different forms in different people's genomes, and these different forms make different amounts of C4 protein in our brains. And he found that the more C4 protein our genes make, the greater our risk for schizophrenia.
Now, C4 is still just one risk factor in a complex system. This isn't big B, but it's an insight about a molecule that matters. Complement proteins like C4 were known for a long time for their roles in the immune system, where they act as a kind of molecular Post-it note that says, "Eat me." And that Post-it note gets put on lots of debris and dead cells in our bodies and invites immune cells to eliminate them. But two colleagues of mine found that the C4 Post-it note also gets put on synapses in the brain and prompts their elimination. Now, the creation and elimination of synapses is a normal part of human development and learning. Our brains create and eliminate synapses all the time. But our genetic results suggest that in schizophrenia, the elimination process may go into overdrive.
Scientists at many drug companies tell me they're excited about this discovery, because they've been working on complement proteins for years in the immune system, and they've learned a lot about how they work. They've even developed molecules that interfere with complement proteins, and they're starting to test them in the brain as well as the immune system. It's potentially a path toward a drug that might address a root cause rather than an individual symptom, and we hope very much that this work by many scientists over many years will be successful.
But C4 is just one example of the potential for data-driven scientific approaches to open new fronts on medical problems that are centuries old. There are hundreds of places in our genomes that shape risk for brain illnesses, and any one of them could lead us to the next molecular insight about a molecule that matters. And there are hundreds of cell types that use these genes in different combinations. As we and other scientists work to generate the rest of the data that's needed and to learn all that we can from that data, we hope to open many more new fronts. Genetics and single-cell analysis are just two ways of trying to turn the brain into a big-data problem.
There is so much more we can do. Scientists in my lab are creating a technology for quickly mapping the synaptic connections in the brain to tell which neurons are talking to which other neurons and how that conversation changes throughout life and during illness. And we're developing a way to test in a single tube how cells with hundreds of different people's genomes respond differently to the same stimulus. These projects bring together people with diverse backgrounds and training and interests -- biology, computers, chemistry, math, statistics, engineering. But the scientific possibilities rally people with diverse interests into working intensely together.
What's the future that we could hope to create? Consider cancer. We've moved from an era of ignorance about what causes cancer, in which cancer was commonly ascribed to personal psychological characteristics, to a modern molecular understanding of the true biological causes of cancer. That understanding today leads to innovative medicine after innovative medicine, and although there's still so much work to do, we're already surrounded by people who have been cured of cancers that were considered untreatable a generation ago. And millions of cancer survivors like my sister find themselves with years of life that they didn't take for granted and new opportunities for work and joy and human connection. That is the future that we are determined to create around mental illness -- one of real understanding and empathy and limitless possibility.
Thank you.
(Applause)