Sara-Jane Dunn: The next software revolution: programming biological cells

The second half of the last century was completely defined by a technological revolution: the software revolution. The ability to program electrons on a material called silicon made possible technologies, companies and industries that were at one point unimaginable to many of us, but which have now fundamentally changed the way the world works. The first half of this century, though, is going to be transformed by a new software revolution: the living software revolution. And this will be powered by the ability to program biochemistry on a material called biology. And doing so will enable us to harness the properties of biology to generate new kinds of therapies, to repair damaged tissue, to reprogram faulty cells or even build programmable operating systems out of biochemistry. If we can realize this -- and we do need to realize it -- its impact will be so enormous that it will make the first software revolution pale in comparison.

And that's because living software would transform the entirety of medicine, agriculture and energy, and these are sectors that dwarf those dominated by IT. Imagine programmable plants that fix nitrogen more effectively or resist emerging fungal pathogens, or even programming crops to be perennial rather than annual so you could double your crop yields each year. That would transform agriculture and how we'll keep our growing and global population fed. Or imagine programmable immunity, designing and harnessing molecular devices that guide your immune system to detect, eradicate or even prevent disease. This would transform medicine and how we'll keep our growing and aging population healthy.

We already have many of the tools that will make living software a reality. We can precisely edit genes with CRISPR. We can rewrite the genetic code one base at a time. We can even build functioning synthetic circuits out of DNA. But figuring out how and when to wield these tools is still a process of trial and error. It needs deep expertise, years of specialization. And experimental protocols are difficult to discover and all too often, difficult to reproduce. And, you know, we have a tendency in biology to focus a lot on the parts, but we all know that something like flying wouldn't be understood by only studying feathers. So programming biology is not yet as simple as programming your computer. And then to make matters worse, living systems largely bear no resemblance to the engineered systems that you and I program every day. In contrast to engineered systems, living systems self-generate, they self-organize, they operate at molecular scales. And these molecular-level interactions lead generally to robust macro-scale output. They can even self-repair.

Consider, for example, the humble household plant, like that one sat on your mantelpiece at home that you keep forgetting to water. Every day, despite your neglect, that plant has to wake up and figure out how to allocate its resources. Will it grow, photosynthesize, produce seeds, or flower? And that's a decision that has to be made at the level of the whole organism. But a plant doesn't have a brain to figure all of that out. It has to make do with the cells on its leaves. They have to respond to the environment and make the decisions that affect the whole plant. So somehow there must be a program running inside these cells, a program that responds to input signals and cues and shapes what that cell will do. And then those programs must operate in a distributed way across individual cells, so that they can coordinate and that plant can grow and flourish.

If we could understand these biological programs, if we could understand biological computation, it would transform our ability to understand how and why cells do what they do. Because, if we understood these programs, we could debug them when things go wrong. Or we could learn from them how to design the kind of synthetic circuits that truly exploit the computational power of biochemistry.

My passion about this idea led me to a career in research at the interface of maths, computer science and biology. And in my work, I focus on the concept of biology as computation. And that means asking what do cells compute, and how can we uncover these biological programs? And I started to ask these questions together with some brilliant collaborators at Microsoft Research and the University of Cambridge, where together we wanted to understand the biological program running inside a unique type of cell: an embryonic stem cell. These cells are unique because they're totally naïve. They can become anything they want: a brain cell, a heart cell, a bone cell, a lung cell, any adult cell type. This naïvety, it sets them apart, but it also ignited the imagination of the scientific community, who realized, if we could tap into that potential, we would have a powerful tool for medicine. If we could figure out how these cells make the decision to become one cell type or another, we might be able to harness them to generate cells that we need to repair diseased or damaged tissue. But realizing that vision is not without its challenges, not least because these particular cells, they emerge just six days after conception. And then within a day or so, they're gone. They have set off down the different paths that form all the structures and organs of your adult body.

But it turns out that cell fates are a lot more plastic than we might have imagined. About 13 years ago, some scientists showed something truly revolutionary. By inserting just a handful of genes into an adult cell, like one of your skin cells, you can transform that cell back to the naïve state. And it's a process that's actually known as "reprogramming," and it allows us to imagine a kind of stem cell utopia, the ability to take a sample of a patient's own cells, transform them back to the naïve state and use those cells to make whatever that patient might need, whether it's brain cells or heart cells.

But over the last decade or so, figuring out how to change cell fate, it's still a process of trial and error. Even in cases where we've uncovered successful experimental protocols, they're still inefficient, and we lack a fundamental understanding of how and why they work. If you figured out how to change a stem cell into a heart cell, that hasn't got any way of telling you how to change a stem cell into a brain cell. So we wanted to understand the biological program running inside an embryonic stem cell, and understanding the computation performed by a living system starts with asking a devastatingly simple question: What is it that system actually has to do?

Now, computer science actually has a set of strategies for dealing with what it is the software and hardware are meant to do. When you write a program, you code a piece of software, you want that software to run correctly. You want performance, functionality. You want to prevent bugs. They can cost you a lot. So when a developer writes a program, they could write down a set of specifications. These are what your program should do. Maybe it should compare the size of two numbers or order numbers by increasing size. Technology exists that allows us automatically to check whether our specifications are satisfied, whether that program does what it should do. And so our idea was that in the same way, experimental observations, things we measure in the lab, they correspond to specifications of what the biological program should do.
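
To make the idea of a specification concrete, here is a minimal Python sketch. It is only an illustration: the function names (`order_numbers`, `satisfies_spec`, `check_all_small_inputs`) are made up for this example, and exhaustively trying every small input is a simple stand-in for the automated verification technology mentioned above, not that technology itself.

```python
from collections import Counter
from itertools import product


def order_numbers(values):
    """Program under test: return the values ordered by increasing size."""
    return sorted(values)


def satisfies_spec(inputs, output):
    """Specification: the output is non-decreasing and contains exactly
    the same items as the input (nothing lost, nothing invented)."""
    is_ordered = all(a <= b for a, b in zip(output, output[1:]))
    same_items = Counter(inputs) == Counter(output)
    return is_ordered and same_items


def check_all_small_inputs(program, max_len=4, domain=range(3)):
    """Try every list up to max_len over a small domain.

    Returns the first input that violates the specification,
    or None if no violation is found in this bounded search.
    """
    for length in range(max_len + 1):
        for inputs in product(domain, repeat=length):
            inputs = list(inputs)
            if not satisfies_spec(inputs, program(list(inputs))):
                return inputs
    return None


# The correct program passes; a "program" that returns its input unchanged
# is caught with a concrete counterexample.
print(check_all_small_inputs(order_numbers))
print(check_all_small_inputs(lambda values: values))
```

The point of the sketch is the separation of roles: the specification says what the program should do, and a mechanical check reports whether it does.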

So we just needed to figure out a way to encode this new type of specification. So let's say you've been busy in the lab and you've been measuring your genes and you've found that if Gene A is active, then Gene B or Gene C seems to be active. We can write that observation down as a mathematical expression if we can use the language of logic: If A, then B or C. Now, this is a very simple example, OK. It's just to illustrate the point. We can encode truly rich expressions that actually capture the behavior of multiple genes or proteins over time across multiple different experiments. And so by translating our observations into mathematical expressions in this way, it becomes possible to test whether or not those observations can emerge from a program of genetic interactions.
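
The observation "If A, then B or C" can be written down directly as a logical expression. The sketch below assumes each gene is simply on or off; the measurements in `observations` are invented for illustration and are not data from the talk.

```python
from itertools import product


def implies(p, q):
    """Logical implication: 'if p then q' is false only when p holds and q does not."""
    return (not p) or q


def spec_a_implies_b_or_c(state):
    """The observation from the lab: if gene A is active, then B or C is active."""
    return implies(state["A"], state["B"] or state["C"])


# Hypothetical measurements: each entry records which genes were active.
observations = [
    {"A": True, "B": True, "C": False},
    {"A": True, "B": False, "C": True},
    {"A": False, "B": False, "C": False},
]

# Measurements that contradict the specification (none for this data).
violations = [obs for obs in observations if not spec_a_implies_b_or_c(obs)]

# Of the 8 possible on/off combinations, count how many the expression allows.
allowed = sum(
    spec_a_implies_b_or_c(dict(zip("ABC", values)))
    for values in product([False, True], repeat=3)
)

print(violations, allowed)
```

Only one combination is ruled out: A active while both B and C are inactive.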

And we developed a tool to do just this. We were able to use this tool to encode observations as mathematical expressions, and then that tool would allow us to uncover the genetic program that could explain them all. And we then applied this approach to uncover the genetic program running inside embryonic stem cells to see if we could understand how to induce that naïve state. And this tool was actually built on a solver that's deployed routinely around the world for conventional software verification. So we started with a set of nearly 50 different specifications that we generated from experimental observations of embryonic stem cells. And by encoding these observations in this tool, we were able to uncover the first molecular program that could explain all of them.
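
As a toy illustration of uncovering a program that explains every observation, the Python sketch below searches over small networks of genetic interactions and keeps those consistent with a handful of specifications. Everything in it is an assumption made for the example: the three genes, the candidate interactions, the update rule and the four specifications are invented, and brute-force enumeration stands in for the solver that the real tool is built on.

```python
from itertools import combinations

GENES = ("A", "B", "C")

# Hypothetical candidate interactions: (regulator, target, sign),
# where +1 means activation and -1 means repression.
CANDIDATES = [
    ("A", "B", +1),
    ("A", "C", +1),
    ("B", "C", +1),
    ("C", "A", -1),
    ("B", "A", -1),
]


def step(state, edges):
    """Advance every gene one time step, all at once.

    Assumed rule: a gene with no regulators keeps its state; otherwise it is
    on when some activator is on (or it has no activators) and no repressor is on.
    """
    new_state = {}
    for gene in GENES:
        activators = [s for s, t, sign in edges if t == gene and sign > 0]
        repressors = [s for s, t, sign in edges if t == gene and sign < 0]
        if not activators and not repressors:
            new_state[gene] = state[gene]
            continue
        activated = any(state[s] for s in activators) if activators else True
        repressed = any(state[s] for s in repressors)
        new_state[gene] = activated and not repressed
    return new_state


# Invented specifications: (initial state, number of steps, expected gene values).
SPECS = [
    ({"A": True, "B": False, "C": False}, 1, {"B": True}),
    ({"A": True, "B": False, "C": False}, 2, {"C": True}),
    ({"A": False, "B": False, "C": True}, 1, {"A": False}),
    ({"A": True, "B": True, "C": False}, 1, {"A": False}),
]


def satisfies(edges, spec):
    """Run the network from the initial state and compare with the expectation."""
    state, steps, expected = spec
    for _ in range(steps):
        state = step(state, edges)
    return all(state[gene] == value for gene, value in expected.items())


def consistent_programs():
    """Enumerate every subset of candidate interactions that meets all specs."""
    found = []
    for size in range(len(CANDIDATES) + 1):
        for edges in combinations(CANDIDATES, size):
            if all(satisfies(edges, spec) for spec in SPECS):
                found.append(set(edges))
    return found


programs = consistent_programs()
# Interactions shared by every consistent program: the specs cannot be met without them.
required = set.intersection(*programs) if programs else set()
print(len(programs), sorted(required))
```

With only five candidate interactions there are 32 networks to try, so enumeration is enough. With realistic numbers of genes and specifications the search space grows far too quickly for this, which is where a solver becomes necessary.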

Now, that's kind of a feat in and of itself, right? Being able to reconcile all of these different observations is not the kind of thing you can do on the back of an envelope, even if you have a really big envelope. Because we've got this kind of understanding, we could go one step further. We could use this program to predict what this cell might do in conditions we hadn't yet tested. We could probe the program in silico.

And so we did just that: we generated predictions that we tested in the lab, and we found that this program was highly predictive. It told us how we could accelerate progress back to the naïve state quickly and efficiently. It told us which genes to target to do that, which genes might even hinder that process. We even found the program predicted the order in which genes would switch on. So this approach really allowed us to uncover the dynamics of what the cells are doing.
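
Probing a program in silico can be sketched in a few lines of Python. The network here is hypothetical: the gene names (`Signal`, `G1`, `G2`, `G3`, `H`) and their rules are invented to show the kinds of predictions described above, namely which gene hinders the process and the order in which genes switch on. It is not the stem cell program from the talk.

```python
# Hypothetical update rules: each gene's next value as a function of the current state.
RULES = {
    "Signal": lambda s: s["Signal"],            # external signal, stays as it is
    "G1": lambda s: s["Signal"],                # switched on by the signal
    "G2": lambda s: s["G1"] and not s["H"],     # needs G1, blocked by H
    "G3": lambda s: s["G2"],                    # switched on by G2
    "H": lambda s: s["H"],                      # hindering gene, stays as it is
}

INITIAL = {"Signal": True, "G1": False, "G2": False, "G3": False, "H": True}


def simulate(initial, rules, knockouts=(), max_steps=10):
    """Run the network until it stops changing (or max_steps is reached).

    Genes listed in knockouts are forced off throughout.
    Returns the final state and a list of (time step, gene) switch-on events.
    """
    state = dict(initial)
    for gene in knockouts:
        state[gene] = False
    switch_on_order = []
    for t in range(1, max_steps + 1):
        new_state = {gene: rule(state) for gene, rule in rules.items()}
        for gene in knockouts:
            new_state[gene] = False
        for gene in rules:
            if new_state[gene] and not state[gene]:
                switch_on_order.append((t, gene))
        if new_state == state:
            break
        state = new_state
    return state, switch_on_order


# Condition already observed: H is active and the cascade stalls after G1.
final_untreated, order_untreated = simulate(INITIAL, RULES)

# Untested condition probed in silico: knock out H and see what the program predicts.
final_knockout, order_knockout = simulate(INITIAL, RULES, knockouts=("H",))

print(order_untreated, order_knockout)
```

In this toy network the simulation predicts that removing `H` lets the cascade complete, with `G1`, `G2` and `G3` switching on in that order. A prediction of this shape is what would then be taken back to the lab and tested.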

What we've developed, it's not a method that's specific to stem cell biology. Rather, it allows us to make sense of the computation being carried out by the cell in the context of genetic interactions. So really, it's just one building block. The field urgently needs to develop new approaches to understand biological computation more broadly and at different levels, from DNA right through to the flow of information between cells. Only this kind of transformative understanding will enable us to harness biology in ways that are predictable and reliable.

But to program biology, we will also need to develop the kinds of tools and languages that allow both experimentalists and computational scientists to design biological function and have those designs compile down to the machine code of the cell, its biochemistry, so that we could then build those structures. Now, that's something akin to a living software compiler, and I'm proud to be part of a team at Microsoft that's working to develop one. Though to say it's a grand challenge is kind of an understatement, but if it's realized, it would be the final bridge between software and wetware.

More broadly, though, programming biology is only going to be possible if we can transform the field into being truly interdisciplinary. It needs us to bridge the physical and the life sciences, and scientists from each of these disciplines need to be able to work together with common languages and to have shared scientific questions.

In the long term, it's worth remembering that many of the giant software companies and the technology that you and I work with every day could hardly have been imagined at the time we first started programming on silicon microchips. And if we start now to think about the potential for technology enabled by computational biology, we'll see some of the steps that we need to take along the way to make that a reality. Now, there is the sobering thought that this kind of technology could be open to misuse. If we're willing to talk about the potential for programming immune cells, we should also be thinking about the potential of bacteria engineered to evade them. There might be people willing to do that. Now, one reassuring thought in this is that -- well, less so for the scientists -- is that biology is a fragile thing to work with. So programming biology is not going to be something you'll be doing in your garden shed. But because we're at the outset of this, we can move forward with our eyes wide open. We can ask the difficult questions up front, we can put in place the necessary safeguards and, as part of that, we'll have to think about our ethics. We'll have to think about putting bounds on the implementation of biological function. So as part of this, research in bioethics will have to be a priority. It can't be relegated to second place in the excitement of scientific innovation.

But the ultimate prize, the ultimate destination on this journey, would be breakthrough applications and breakthrough industries in areas from agriculture and medicine to energy and materials and even computing itself. Imagine, one day we could be powering the planet sustainably on the ultimate green energy if we could mimic something that plants figured out millennia ago: how to harness the sun's energy with an efficiency that is unparalleled by our current solar cells. If we understood that program of quantum interactions that allows plants to absorb sunlight so efficiently, we might be able to translate that into building synthetic DNA circuits that offer the material for better solar cells. There are teams and scientists working on the fundamentals of this right now, so perhaps if it got the right attention and the right investment, it could be realized in 10 or 15 years.

So we are at the beginning of a technological revolution. Understanding this ancient type of biological computation is the critical first step. And if we can realize this, we would enter the era of an operating system that runs living software.

Thank you very much.

(Applause)
