So a while ago now, I did a PhD, and I actually thought it would be quite easy to do research. Turns out it was really hard. My PhD was spent coding up neural network layers and writing CUDA kernels, very much computer-based science. And at that time, I had a friend who worked in a lab doing real messy science. He was trying to work out the structure of proteins experimentally. And this is a really difficult thing to do. It can take a whole PhD's worth of work just to work out the structure of a single new protein system.
And then 10 years later, the field that I was in, machine learning, revolutionized his world of protein structure. DeepMind created a neural network called AlphaFold that can very accurately predict the structure of proteins, solving the 50-year challenge of protein folding. And just two weeks ago, this won the Nobel Prize in Chemistry. And it's estimated that since the release of this model, we've saved over a billion years of research time.
(Applause)
A billion years.
(Applause)
A whole PhD's worth of work is now approximated by a couple of seconds of neural network time. And to my friend, this might sound a bit depressing, and I'm sorry about that, but to me, this is just really an incredible thing. The sheer scale of new knowledge about our protein universe that we now have access to, due to an AI model that's able to replace the need for real-world experimental lab work. And that frees up our precious human time to begin probing the next frontiers of science.
Now some people say that this is a one-time-only event, that we can't expect these sorts of AI breakthroughs in science to be repeated. And I disagree. We will continue to see breakthroughs in understanding our real messy world with AI. Why? Because we now have the neural network architectures that can eat up any data modality that you throw at them. And we have tried and tested recipes for incorporating any possible signal in the world into these learning algorithms. And then we have the engineering and infrastructure to scale these models to whatever size is needed to take advantage of the massive amounts of compute power that we can create. And finally, we're always creating new ways to record and measure every detail of our real messy world that then creates even bigger data sets that help us train even richer models.
And so this is a new paradigm in front of us, that of creating AI analogs of our real messy world. This new AI paradigm takes our real, messy, natural world and learns to recreate the elements of it with neural networks. And why these AI analogs are so powerful is that it's not just about understanding, approximating or simulating the world for the sake of understanding, but this actually gives us a little virtual world that we can experiment in at scale to ultimately create new knowledge.
And you can imagine that this experimentation against our AI analogs, this can also happen in silico, in a computer with other agents, in a loop of in silico, open-ended discovery, ultimately to create new knowledge that we can take back out and change the world around us. And this isn't science fiction. Right now, we have thousands of graphics cards burning, training foundational models of our own micro-biological world, and then agents that are probing these AI analogs to design new molecules that could be potential new drugs.
And I want to show you exactly how this process works for us, because I believe it can serve as a blueprint to bring about a whole new wave of the future of AI-driven scientific and technological progress.
Now drug design is such an important area to focus on because it's actually becoming harder and harder to design new drugs. This is a graph of the number of new drugs created per billion dollars of R&D spent over time. And what you can see is that the number of new drugs is exponentially decreasing, a trend known as Eroom's law. It's becoming more and more expensive to create a new drug. Now during this same time period, we've had a huge amount of advancement in the capabilities of AI, driven by a whole host of algorithmic breakthroughs. But one of the secret sauces of this advancement in AI has also been Moore's Law: the amount of computing power has just been exponentially increasing over time. And these days, it perhaps isn't Moore's Law that we should care about, but Jensen's law, named after Jensen Huang, the CEO of Nvidia, for the exponential increase in GPU FLOPS that are now powering our neural networks.
So really the question is, how do we bring this world of AI and machine learning to that of drug design? Can we think about using our AI analogs to reverse this curse of Eroom’s law and jump on this exponential wave of GPU FLOPS powering our neural networks? Actually bringing these worlds together and driving this change is the day-to-day responsibility that I feel.
So how can we go about modeling biology? Well if we were in the world of physics, for example, modeling the universe, then we can actually write down a lot of the theory by hand with maths and very accurately predict, for example, the unfolding of the universe, even millions of light years away. But we can't do that for the incredibly complex dynamics within ourselves. We can't just write down some equations for ourselves. We can perhaps write down the theory of how atoms interact. That's physics. But then simulating these interactions on the scale of trillions of atoms within our cells is just completely unfeasible. And then we haven't worked out how to describe these complex dynamics in coarser and simpler terms that we could write down with maths. It’s just crazy to think that we can model the universe so far away but not the cells at our fingertips.
But AI and machine learning can be the perfect abstraction for a biological world. Using the snippets of data that we can record from our cells, we can then learn the equations and theories and abstractions implicitly within the activations of our neural networks. In fact, our company is called Isomorphic Labs. Isomorphic because we believe there is an isomorphism, a fundamental symmetry, that we can create between the biological world and the world of information science, machine learning and AI.
So to see how we are using these AI analogs today, I want to dive into the body and have a look into cells and think about proteins. Now proteins are one of the fundamental building blocks of life. And these proteins carry different functions in the body. And if we can modulate the function of a protein, then we are well on our way to creating a new drug. Proteins are made up of a sequence of amino acids, and there are about 20 different amino acids, each one here depicted by a different letter. An amino acid is a collection of atoms, a molecule, and these molecules are joined together into a linear sequence. And the function of a protein is not just due to its sequence of amino acids, but also due to the three-dimensional shape that the protein folds up into. And there are thousands of proteins inside of us, each with their own unique sequences and their own unique 3D shape. And remember, trying to work out experimentally that 3D shape can take months or even years of lab work.
But with the breakthrough of AlphaFold and AlphaFold 2 in 2020, we now have a model that can take the sequence of amino acids as input and then very accurately predict the 3D structure of a protein as the output. And this allows us to actually fill in the gaps of our known protein universe. It's our AI analog of proteins.
So proteins carry their function. But these proteins, they don't actually act in isolation. They're part of bigger molecular machines with these proteins interacting with other proteins as well as other biomolecules like DNA, RNA and small molecules.
For example, let's zoom in and have a look at this protein. This is a protein that repairs DNA, and it interacts with DNA clamping down on it, helping facilitate repair and then the repaired DNA is released back out to the cell. Now in drug design what we want to do is either make molecular machines work better or actually stop them from working. And in this case, for cancer, we actually want to stop this particular DNA repair protein from working, because in cancerous cells there is no backup DNA repair mechanism. And so if we stop this one working, then cancerous cells will die, leaving just healthy cells remaining.
So what would a drug actually look like for this protein? Well a drug is something that comes in and modulates a molecular machine. And this could be a drug molecule that goes into the body, goes into the cell and then sticks to this protein just over here. And this drug molecule actually glues the DNA repair protein's clamp shut, so it can't do effective DNA repair, causing cancerous cells to die and leaving just healthy cells remaining. Now to design such an amazing drug molecule completely rationally, we'd have to understand how all of these biomolecular elements come together. We would need an AI analog of all and any biomolecular systems.
Earlier this year, we had a breakthrough. We developed a new version of AlphaFold, called AlphaFold 3, that can model the structure of almost all biomolecules coming together with unprecedented accuracy. This model takes as input the protein sequence, the DNA sequence and the molecule atoms. And these inputs are fed to a neural network that has a large processing trunk based on transformers. Now unlike a large language model that operates on one-dimensional sequences, our model uses what’s called a “pairformer” and operates on a 2D interaction grid of the input sequence. And this allows our model to explicitly reason about every pairwise interaction that could occur in this biomolecular system. And so we can use the features of this processing trunk to condition a diffusion model.
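To make the idea of a 2D interaction grid concrete, here is a minimal sketch in Python. It is not AlphaFold 3 code: the token labels, the `build_pair_grid` function and the `same_molecule_type` feature are all hypothetical, chosen only to show that a list of N input tokens gives rise to an N x N grid with one cell for every pair.

```python
# Toy illustration only: names and features here are assumptions, not AlphaFold 3's.

def build_pair_grid(tokens):
    """Return an N x N grid with one cell per ordered pair of input tokens.

    Each token is a (molecule_type, name) tuple. Each cell stores the pair of
    names plus one simple pairwise feature.
    """
    n = len(tokens)
    grid = []
    for i in range(n):
        row = []
        for j in range(n):
            kind_i, name_i = tokens[i]
            kind_j, name_j = tokens[j]
            row.append({
                "pair": (name_i, name_j),
                "same_molecule_type": kind_i == kind_j,
            })
        grid.append(row)
    return grid


# Protein residues, DNA bases and small-molecule atoms all become tokens.
tokens = [
    ("protein", "M"),
    ("protein", "K"),
    ("dna", "A"),
    ("dna", "T"),
    ("ligand", "C"),
]
grid = build_pair_grid(tokens)
print(len(grid), "x", len(grid[0]), "cells, one per pairwise interaction")
```

In a real model each cell would hold a learned feature vector rather than a small dictionary, but the shape of the data is the point: the grid grows with the square of the input length, which is what lets a model attend to every possible pairwise interaction.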
Now you might know diffusion models as these amazing image generative models. But instead of diffusing the pixels in an image, our diffusion model diffuses the 3D atom coordinates of our biomolecular system. So now this gives us a completely malleable virtual biomolecular world. It’s our AI analog that we can probe as if it’s the real world. We can make changes to the inputs, changes to the molecule designs and see how that changes the output structure.
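As a rough illustration of what diffusing 3D atom coordinates means, the sketch below shows only the noising half of a diffusion process: Gaussian noise of a chosen scale is added to every x, y, z coordinate. The function name `add_noise`, the noise levels and the toy coordinates are assumptions made for this example. In an actual diffusion model, a trained neural network learns to reverse this corruption step by step, and that network is not shown here.

```python
import random


def add_noise(coords, sigma, rng):
    """Return a copy of the atom coordinates with Gaussian noise of scale sigma.

    coords is a list of (x, y, z) tuples; rng is a random.Random instance.
    """
    return [tuple(c + rng.gauss(0.0, sigma) for c in atom) for atom in coords]


# Three atoms of a made-up molecule (arbitrary units).
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
rng = random.Random(0)  # seeded so the run is repeatable

# Larger sigma scrambles the structure more; sigma = 0 leaves it untouched.
for sigma in (0.0, 0.5, 5.0):
    noisy = add_noise(atoms, sigma, rng)
    print(sigma, [tuple(round(c, 2) for c in atom) for atom in noisy])
```

Generation then runs in the opposite direction: start from heavily noised coordinates and repeatedly denoise them, with the features from the processing trunk steering the result toward a plausible structure.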
So let's use this model to design a new drug for our DNA repair protein. We can take a small molecule that's been recorded to stick to this protein and make changes to its design. We want to change the molecule design so that this molecule makes more interactions with the protein, and that will make it stick to this protein more strongly. And so you can imagine that this gives a human drug designer a perfect game to play. How do I change the design of this molecule to create more interactions? Now normally, a drug designer would have to wait months to get results back from a real lab at each step of this design game. But for us, using this AI analog, this takes just seconds. And this is the reality of what our drug designers back in London are doing right now.
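One simple way to picture the score in this design game is to count how many atoms of the molecule sit close to atoms of the protein. The sketch below does just that, as a deliberately crude stand-in: the `count_contacts` function, the 4.0 distance cutoff and the hard-coded coordinates are all assumptions for illustration. In the workflow described above, the coordinates would come from the structure model's prediction, and real interaction scoring is far richer than a distance count.

```python
import math


def count_contacts(ligand, protein, cutoff=4.0):
    """Count ligand-protein atom pairs that lie within the cutoff distance."""
    return sum(
        1
        for a in ligand
        for b in protein
        if math.dist(a, b) <= cutoff
    )


# Made-up protein atoms along a line, and two candidate molecule designs.
protein = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
design_a = [(0.0, 3.0, 0.0)]
design_b = [(0.0, 3.0, 0.0), (5.0, 3.0, 0.0)]  # design_a plus one extra atom

print(count_contacts(design_a, protein))  # 1
print(count_contacts(design_b, protein))  # 2
```

Under this toy score, the second design wins because its extra atom makes one more contact. The designer's loop is then: propose a change, predict the structure, score it, and keep the change if the score improves.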
So we have this beautiful game that's being played by our drug designers, who are using this AI analog of biomolecular systems to rationally design potential new drug molecules. But you can imagine that we don't have to just limit this game to human drug designers. Earlier in my career, I worked on training agents to beat the top human professionals at the game of StarCraft. And we created game-playing agents for the games of Go and Capture the Flag. So why can't we create agents that instead play the game that our human drug designers are playing? So now our AI analog becomes the game environment, and we can train agents against that. And we have some incredibly powerful agents that are already doing this today.
Now in this setup, all of the drug design is happening on a computer. So what happens if we have access to many, many computers? Well instead of having one human drug designer working on some new molecule designs, we can have thousands of agents doing molecule design in parallel. Just imagine what impact that could have on patients suffering from a rare type of cancer, the speed that we could get to a potential new molecule to address this medical need or the ability to go after many diseases in parallel. Cancer is often caused by mutations of proteins, and even within the same type of cancer, each patient can have different mutations. And that means that one drug molecule won't work for all patients. But what if we could go in and measure each individual patient's protein mutations, and then have a whole team of molecule-design agents working on that individual's protein mutations? Then we could create a molecule tailored for each individual patient.
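The pattern of many agents searching in parallel can be sketched with a few lines of Python. Everything here is a toy under stated assumptions: `toy_score` is a hypothetical stand-in for the AI analog, a "design" is just three numbers, and each agent is a simple seeded random search rather than a trained agent. Threads are used only to keep the example self-contained; a real system would spread the agents across many machines.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_score(design):
    """Hypothetical stand-in for the AI analog's score; higher is better."""
    return -sum((x - 0.5) ** 2 for x in design)


def run_agent(seed, steps=200):
    """One agent: start from a random design and keep changes that score better."""
    rng = random.Random(seed)
    best = [rng.random() for _ in range(3)]
    best_score = toy_score(best)
    for _ in range(steps):
        # Propose a small random change, clipped to the allowed range [0, 1].
        candidate = [min(1.0, max(0.0, x + rng.gauss(0.0, 0.1))) for x in best]
        score = toy_score(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best_score, best


def run_agents(n_agents):
    """Run many independent agents and return the best (score, design) found."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(run_agent, range(n_agents)))
    return max(results)


best_score, best_design = run_agents(16)
print(best_score, best_design)
```

Because each agent is independent, adding more computers adds more agents without changing the code for any single one. The per-patient idea in the talk corresponds to giving each group of agents a different environment, one per patient's protein mutations.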
I'm showing just this. Here the protein is randomly mutating, and each mutation in red subtly changes the 3D shape of this protein. And we're able to generate molecules that should stick to this protein in response to these changes. Now this is still far away from patients, and there's a huge amount of complexity in drug design left to tackle, but this really does give us a glimpse at the future that is to come.
So we've seen how this new AI paradigm is driving our progression in drug design. And you can also see this paradigm being played out in materials science, in creating new forms of energy and in chemistry. It is the ability to take our real messy world, create our own AI analogs of it, and then do open-ended scientific discovery on a computer, creating new knowledge that we can take back out to change the world around us. This is an incredibly powerful paradigm, and one that will bring about a whole new wave of scientific and technological advancements. And we’re going to need as many people as possible, especially those working in machine learning, AI and technology, to help drive this new wave of progression.
Thank you.
(Applause)