For our entire lives, we’ve been living through a computing revolution. Many of you here in this room have contributed to it, with leaps forward in compute, connectivity, mobile technologies, and now AI.
For my part, I've dedicated my entire career to augmented reality, fusing the real world with computing experiences. I started this journey about 25 years ago for my PhD. It might sound groundbreaking, but check out these early prototypes. The technology was really primitive back then, but the potential felt limitless when we were experimenting in that university lab.
What I didn't know at the time was that many of the fundamental innovations for my work would come from AI researchers in a different lab. While I was helping computers see the world, these AI researchers were helping computers reason about the world. Since then, large language models and multimodal reasoning have enabled richer language and image understanding. These models are now fast enough for live conversations where the AI can act on your behalf and, most importantly, with your permission. And augmented reality and virtual reality have moved computing from the rectangular screen to the 360-degree immersive display, to now even the world itself becoming the display. We now refer to this broad collection of experiences as extended reality, or XR.
Until now, these innovations have occurred separately and in silos. Well here comes act two of the computing revolution. AI and XR are converging, unlocking radical new ways to interact with technology on your terms. Computers will become more lightweight and personal. They will share your vantage point, understand your real-world context and have a natural interface that's both simple and conversational. Countless people and companies are innovating in this space, including many on the TED stage this week.
We're excited to contribute to this momentum with Android XR. It's an operating system we're building with Samsung that brings XR hardware together with Gemini, our AI assistant, to augment human intelligence. It can support a broad range of devices, from headsets to glasses to form factors we haven't even dreamed of yet.
OK, let me show you where we’re heading by inviting my colleague Nishtha to the stage.
Welcome, Nishtha.
(Applause)
Nishtha Bhatia: Hi.
Shahram Izadi: These ordinary-looking glasses are packed full of technology. A miniaturized camera and microphones give the AI the ability to see and hear the world. Speakers let you listen to the AI, play music or even take calls. And these glasses also have a tiny, full-color, high-resolution in-lens display, like the one I'm holding in my hand. It's very, very small. These glasses work with your phone, streaming data back and forth, which allows the glasses to be very lightweight and access all your phone apps. And if you're wondering, I'm wearing the glasses too. They're actually helping me see all of you in the audience, because they have prescription lenses inside them. But they’re also displaying my speaker notes for me as well.
(Laughter)
For this demo, you'll see what Nishtha is seeing on the screen behind her. And this is the very first time we're showing these glasses in public. So this is going to be a live demo of conceptual hardware, early software. What could go wrong? Nishtha, best of luck. No pressure.
NB: Amazing. Alright, let’s just make sure that these glasses are shown on the screen behind us. OK.
[Glasses screen off]
Awesome.
(Applause)
Now I'm going to launch Gemini.
Gemini: Hello there.
NB: Hi, Gemini. Can you start us off with something fun and creative, and write a haiku for what I'm seeing right now?
G: Sure. Faces all aglow. Eager minds await the words. Sparks of thought ignite.
SI: Some anxious but happy faces as well, yeah. As you can see, the AI sees what Nishtha sees, hears what she hears and responds in real time. But that’s just table stakes. I think everyone in this audience has seen demos like this before. Let's step it up a notch with something we call Memory. Within a rolling contextual window, the AI remembers what you see without having to be told what to keep track of.
NB: So you guys may have noticed I snuck a peek back at the shelf a moment ago. I wasn't paying attention, but let's see if Gemini was. Hey, did you happen to catch the title of the white book that was on the shelf behind me?
G: The white book is “Atomic Habits” by James Clear.
NB: That is absolutely right. So let's try something harder. I keep losing my hotel key card. Do you know where I last left the card?
G: The hotel key card is to the right of the music record.
NB: Great.
SI: For someone as forgetful as me, that's a killer app. OK, this AI is inherently multimodal. So it's using visual data and natural language to digest complex queries in real time while remembering your past context. Nishtha, let's see some more.
NB: Yeah, let’s bring this book back. So I don't really have time to read it right now. But I wonder if Gemini might be able to help me summarize what's going on. Like, I don't know. What does this diagram mean? It looks complex.
G: This diagram is labeled “The Habit Line” and illustrates the process of habit formation. It shows how automaticity increases with repetitions until a habit is formed.
NB: OK.
SI: Now imagine being able to communicate and understand any language. The AI can shift its output to speak different languages with dialects and accents that feel human and natural.
NB: Yeah, let’s go to this sign over here. Can you translate this to English for me?
G: The sign states: “Private property, no trespassing.”
SI: OK, let's prove this is a live demo. Does anyone in the audience have a suggestion for a different language to translate to?
Audience: Farsi.
SI: Farsi. We tried Farsi before and it failed, but let's try it again.
NB: Do you want to try Farsi?
SI: Because I do speak Farsi. It's my mother tongue, so thank you.
NB: Gemini, let’s just give this a shot. Can you translate this sign to Farsi for us?
G: (Speaking Farsi)
SI: Great, awesome. It speaks Farsi.
NB: That’s amazing. So, as Shahram mentioned, you all may have seen translation demos like this before, but what's new now is that in addition to just saying things in a different language, I can also speak to Gemini in another language. I know Hindi, so let's give this a shot.
(Speaks Hindi)
G: (Speaks Hindi)
NB: So Gemini said you all look focused and excited, and it has a better accent than I do.
(Laughter)
SI: Alright, now let's see how the AI can connect the physical world with your digital content and take action.
NB: Yeah, let’s get some music going in here. OK, Gemini, why don't you play us a track from this record?
G: Sure, here’s “Bad Dreams” by Teddy Swims.
(Music)
SI: Perfect. In a few seconds, the AI recognized the album, looked up the tracklist and took action using the phone apps.
OK, Nishtha, it does look like the demo gods are with us. Maybe with you more than me. Let's do one last demo I know you're keen to give.
NB: Yes, this is my first time in Vancouver, and I love going on walks. So why don't you navigate me to a park nearby with views of the ocean?
G: OK, I am starting navigation to Lighthouse Park, which has magnificent views of the Pacific Ocean. Is there anything else I can assist you with?
NB: Honestly, with these directions and a 3D map, I should be all set, and hopefully I won’t look like a tourist.
Thank you all.
(Applause)
SI: Thank you, Nishtha, that was awesome. OK, we've seen glasses. Now let's turn our attention to the other side of the spectrum: headsets. You've seen these types of devices before, but when we first brought AI to a headset, it completely caught me by surprise. For this demo, we're going to use the Project Moohan headset that Samsung is launching later this year. Compared to glasses, headsets give you an infinite display for getting work done or immersing yourself in a movie, or maybe a TED Talk at some point. Let me bring up my colleague Max to show us even more. Hey, Max.
Max Spear: Hello.
(Applause)
SI: Same setup here: you'll see exactly what Max is seeing on the screen behind him. Go for it, Max.
MS: Let’s do it. You'll notice we start grounded in the real world. And I'm going to control the entire system with my eyes, hands and voice. But where things get really interesting is when we invite Gemini in as a conversational companion that can come with us anywhere.
G: Hello.
MS: Hey, Gemini, can you bring up my trip planner for me, please?
G: Of course. Opening up your trip planner.
MS: Cool, but I left these windows really disorganized. Can you help with that?
G: Of course I can help with that.
SI: No clicks, no keyboards. It's just a conversation. And the AI is taking action. OK, some more audience participation. Someone shout out a name of a place you want to visit.
Audience: Melbourne.
Audience: Cape Town.
SI: Let's go to Cape Town.
MS: OK, sounds fun. Can you please take me to Cape Town?
G: Certainly. Let me help with organizing the windows.
SI: (Laughs)
MS: Awesome. And can you also take me to Cape Town?
(Laughter)
G: I can certainly do that. Let me take you to Cape Town.
MS: And we’re very organized as we go there, perfect.
SI: As you can see, the AI is taking Max's request, figuring out how best to answer it, and opening up the Maps app. And from there, he can explore anywhere in the world in this 3D view.
MS: OK, this is pretty incredible. Viewing the world from this angle, I can even zoom in to city level. But what's really interesting is having an AI here who can see what I see. Can you tell me more about the significance of this place?
G: I can indeed provide information about Table Mountain. Table Mountain holds a profound significance, deeply rooted in both its natural splendor and its rich cultural history. It has long been a spiritual and cultural symbol for the Indigenous Khoekhoe and San people.
MS: OK, very good, Gemini. I'll come back to this later. It also does look like a table, that's handy.
SI: OK, Max, let's bring this closer to home. How about exploring winter sports, seeing as we're in Vancouver?
MS: OK, let’s get some inspiration for Shahram. Can you please show me 360 videos of this?
[Powder snowboarding]
G: Sure, here are some 360-degree powder snowboarding videos.
MS: OK, I’m kind of new to snowboarding, so hopefully this video will help me learn. Although this already looks way more intense than I think I'm ready for. What was the name of that trick?
G: That’s a backside 540 with a grab.
SI: The AI knows exactly what Max is looking at, even in rich video content, and can answer any question in real time. Max, are you sure this is Whistler?
MS: You know, Shahram, I actually have no idea. I've never been here before, but let's find out. What run do you think this is?
G: It’s probably one of the ungroomed runs in summit bowl or the top section of West Ridge on Mount Bachelor, as they are close to the summit and face north towards Three Sisters and Broken Top, the prominent mountains visible in the background.
MS: OK, Mount Bachelor, Oregon, got it.
SI: OK, let's have a little bit of fun now. Let's have the AI narrate this video. Somebody in the audience, shout out a fun narration style, like a nature documentary or something like that.
Audience: Morgan Freeman.
SI: We can't do actors, but we'll get to that at some point. Any other suggestions?
Audience: Movie trailer.
Audience: Horror movie.
SI: Horror movie. That's going to be -- it might be gory, but let's go for it.
MS: Slightly dark, but let’s do it. Can you please describe what you're seeing as if you're an overly enthusiastic character in a horror movie?
G: Oh, darling, this is simply marvelous. A desolate mountain scape. How delightfully ominous. Prepare for a chilling dance with the unknown. Where every gust of wind whispers tales of icy terror and lurking shadows.
MS: OK, Gemini, that’s great. Return to more positive things another day.
SI: OK, Max, I know there's one last demo, and you've been apparently practicing your farming skills?
MS: Yes, Shahram, I’ve been excited to get back to my rural roots and also embrace one of my favorite things, which is playing games. But it becomes a whole new, next-level experience when you have an AI by your side, both to keep you company as you play and to help you learn how to play these kinds of games in the first place. So, for example, what do you think I should do next here?
G: It looks like this is early in the game in “Stardew Valley.” You need to earn money, so try planting some seeds. First, use your hoe to till the soil, and then plant your seeds. It looks like you have some parsnips. Since it isn't raining, you'll also need to water them with your watering can.
MS: Seems like a lot, and maybe robots will do this for us one day, but for now, I'm going to have to return back to my farm later. For one last thing, can you please give a playful two-sentence summary of all the ground we covered today?
G: We soared from organizing a digital realm to traversing snowy mountains, and even found time to farm some virtual parsnips. Quite the adventure today.
MS: Quite the adventure, indeed. Thanks, Gemini.
(Applause)
SI: Fantastic, thank you, Max. We're entering an exciting new phase of the computing revolution. Headsets and glasses are just the beginning. All this points to a single vision of the future: a world where helpful AI converges with lightweight XR. XR devices will become increasingly wearable, giving us instant access to information, while AI becomes more contextually aware, more conversational and more personalized, working with us on our terms and in our language. We're no longer augmenting our reality, but rather augmenting our intelligence.
Thank you so much.
(Applause)