Chris Anderson: So Tom, your company became prominent on the internet with the release of a fake Tom Cruise video, DeepTomCruise, that I think attracted like, a billion views on TikTok and Instagram. Which leads me to my first question, which is, please, can we, at TED, have our own Tom Cruise video, please?
Tom Graham: You know, I thought you might ask. So a little earlier, we had a crack. Maybe we'll have a look.
CA: Let's have a look.
DeepTomCruise: What's up, internet? I'm north of the border, eh?
(Laughs)
At the TED conference. It's not short for Theodore, but nobody calls me Thomas, so it's cool. It's Tom at TED. Yes I ... Canada.
(Laughs)
Seriously, though, everybody here, very nice, very polite. Especially the whales.
(Laughter)
CA: I mean, how did you do that?
TG: So really, at Metaphysic, we specialize in creating artificially generated content that looks and feels exactly like reality. We take real-world data, we train these neural nets, and they can, more accurately than VFX or CGI, create content that looks and feels completely natural. And that is a great example of the AI being prompted by the natural performance of a person, with the generated face going on top.
CA: And it helps, the fact that your co-founder is, you know, he's a pretty good Tom Cruise impersonator.
TG: Indeed. Miles Fisher is the foremost Tom-Cruise-but-not-Tom-Cruise, yeah.
CA: I think you have another example as well. Can we see that one?
TG: Yes, absolutely. Going kind of beyond faces now, talking about voice.
[Original Spanish vocal performance]
[Audio-to-face synchronization]
[AI-generated vocal and visual performance]
CA: Tell us what's happening there.
TG: So what you see there is really the singing voice of a lady singing in Spanish, and Aloe Blacc, who sings "Wake Me Up," the Avicii song that he wrote with Avicii. We are transporting her voice across to his face, even though he doesn't sing in Spanish. And then suddenly we transported her voice again into his voice. So anybody in the future will be able to speak any language, and it'll look perfectly natural. This content is becoming easier and easier to create, and eventually it will reach a scale where we will all be, kind of, main characters in our own content on the internet.
CA: OK.
(Laughter)
Before we dig into that just a bit, I mean, you’ve shown us recorded video there. Rumor has it you can also do this with live video. Can that be right?
TG: Yes, we can do it live, in real time. And this is really at the cutting edge of what we can do today: moving from offline processing to processing so fast that you can do it in real time.
CA: Well I'm going to challenge you and your team to do a bit of a first here. There's video of you right up on that screen. Show us something surprising you can --
(Laughter)
Oh, my gosh.
TG: So there we go. This is, you know, a live, real-time model of Chris on top of me, running in real-time.
CA: And next, you'll tell me that it can ...
(Laughter)
Oh, Lord. I am so uncomfortable with this.
(Laughter)
I am so uncomfortable. Can it do voice as well?
TG: Um.
(Live AI-generated Chris Anderson voice) We think it can. We're really pushing the limits of AI technology now. And I'm talking exactly as I would as Thomas Graham, but it's coming out as the voice of the one and only Chris Anderson.
CA: I'm deeply sorry, everyone, to subject you to this. You know, there's something possibly even worse. So, to come clean on this: they took some shots of me a couple of weeks ago doing different facial expressions, so they could capture a video model. And it turns out, I discovered this week, that they can apply that not just to Tom, but to anyone.
And so, my dear friend Sunny Bates is here in the front row. Sunny, do you consent to channel inner me for a minute? Can we try that? I'm really worried about this.
(Laughter)
Do we have Sunny on screen? Oh, there's Sunny. Oh, no, oh, no.
(Laughter) Oh, no.
(Laughter)
TG: Chris, you look amazing.
CA: All right, enough of that, cut that right now.
(Sunny Bates mouthing) More, more!
TG: Yes, more, more of this.
CA: Tom, Tom, Sunny Bates is the woman who introduced me to TED. Without Sunny Bates, none of us would actually be here now. And we reward her with this?
TG: And now you've finally become the master, you know?
CA: Oh, so look. OK, amazing. It obviously occurs to everyone in this room that there are some things that can go horribly wrong with this.
(Laughter) And, you know, we've already seen examples online. We've had fake photographs of Trump being arrested; there could be video of it. There's pornography that uses the faces of celebrities. All these things we've seen: deepfakes. How do you feel about the downside of this technology?
TG: So personally, you know, we build this stuff and I'm worried, right? Worried is the right instinct for everybody to have. And then beyond that, think about, you know, what can we do to prepare ourselves? How can we try to impact the future as it spirals in this direction, where as individuals, it's going to be kind of difficult to understand what's computer-generated and what's real. And so there are things that we can do there. Raising public awareness of manipulated media. That's one, you know, this is a great forum for that.
CA: And your claim is that even if you were to shut down your company right now, it wouldn't stop the problem of deepfake videos, because the technology is out there; it's going to happen anyway, and it's not within your control?
TG: Yes. We're talking about content that is so compelling. If we put any of ourselves inside content, maybe it's talking to our loved ones or just talking to our friends on the beach, and it's so realistic, it looks real, it's so compelling. Everybody is trying to create this content today. All of the GPUs in the universe are being driven toward creating this. So it doesn't matter what any one person does. This will happen, and it's happening very, very quickly.
CA: So, I mean, we'll talk about the upside in a second. But it seems like we are going to have to get used to a world where we and our children will no longer be able to trust the evidence of our eyes.
TG: I think so. We are going to have to build a new set of institutions to verify what is authentic media. But then we can begin to lean into some of the creative things that come from it, and there are benefits that come with that too. So it'll be an accommodation.
CA: So talk a bit about the benefits. I mean, obviously, on the entertainment side, there's an amazing number of possibilities. You have Tom Cruise, we can have "Mission: Impossible 273" in the year 2150, right? Like, you will be with us forever.
TG: We're working on number 75,001 right now, yeah.
CA: I guess that is kind of amazing. Like, we love lots of people in the world, and we want the possibility that, with their permission, we can do more with them. Talk about some other possible upsides.
TG: What we've seen in building this content and watching people interact with it, especially if they're interacting with themselves, maybe their younger 20-year-old self, or maybe they're interacting with their partner, but the young version of their partner, is this tremendous emotional connection that comes from very photorealistic, beyond-the-uncanny-valley kind of content. And so if we can start deploying that among regular people, if we can scale it up so that it works for any kind of person, then we can begin to have more interesting, meaningful, human kinds of interactions and relationships online. And since the pandemic, we all spend more time online. But it's chat and it's email. If we could get more human emotion, more feeling, there's a lot that we can do with that, right?
And so, education is a good example. We could have an inspiring teacher in thousands of classrooms around the world, speaking every language in the world at the same time, and students could interact with each other in a way that goes beyond Zoom. There can be real cultural exchange, real socialization. There's a lot that we can do beyond here.
CA: So that teacher example is powerful to me, like, the fact that a single teacher could turn a written lesson into video in any number of languages and extend indefinitely. That seems like a real amplification potentially of good human intent. I'm excited by that. I still don't get the family side of it. Like, if you want to have a human connection with someone in your family, like, won't people just be creeped out by, "Oh my God, I was just looking at your avatar, I thought it was you." You know, like, isn't that just creepy?
TG: I think there's definitely a creepy element to this, right? And then you go beyond that, and the creepiness drops away and the medium drops away, and it's the content and the connection that's there. So, you know, I imagine that if I collect data from my grandparents, who are very old today, then in the future I'll be able to relive experiences with them and communicate with them. And that's not going to be any good for them, right? They're probably going to have passed on. But for me, it'll help me process who I am and my relationship with them. It's that idea of decoupling human experience both from where it happens and from the moment in time when it happens. I think that we can create these experiences through hyperreal, photorealistic media that allow us to share the best of our experiences, the best of who we are.
CA: I'm going to be very curious to see who can feel that right now. My guess is that there's going to have to be lots of experiments and a lot of things that are going to creep us out. And maybe we'll find some things that are just absolutely incredible. But I have an important question for you, which is how the hell do I control me now? You've got me on video. What's to stop me being misused across the internet now?
TG: That's right. I think that, as we kind of, allow companies to create these realistic experiences, as individuals, we need to be empowered to own our real-world data, the data used to train the algorithms, and we need to control how our photorealistic avatars are created and where they're used.
So to this extent, I was looking at kind of conventional, current legal institutions to see what we could do to create new rights. So I created a photorealistic avatar of myself, submitted it to the US Copyright Office to see if they would register my copyright in it, and this is what the video looked like.
(Video) TG: Here is the AI, realistic version of myself. Even if the appearance of this AI representation of myself may change cosmetically, or if I change my hair or add creative features, my intention is to create this AI version of myself that embodies the essence of who I am as a person under any circumstance.
CA: Wow. So I think you have shown us what is going to be a repeated theme in this conference, which is that future is going to be weird and wonderful at the same time. Quite what the balance is between those two, TBD. But this is a world where each of us is going to have to think differently about who we are and claim these rights. What you're saying is that if people have this right, you can picture a world where, for example, if a video goes viral on YouTube without your consent, you'll be able to take it down because of some link back to this.
TG: That's right. I think the most important thing we can do, when we're talking about real-world data being able to power these things, is to own property rights over the data. We shouldn't sign it over to companies through terms of service; we shouldn't give it away. If you fundamentally own it, then you can be in control, and you're at the ground level of all of the economies and all of the use cases that are going to spin up through history. It's a lot, but we're on the way now.
CA: Tom Graham, thank you so much for sharing this technology at TED. And Sunny Bates, I'm so sorry.
(Laughter)
(Applause)