Bilawal Sidhu: The AI-powered tools supercharging your imagination

All right. Good afternoon, y’all. Let's talk about blending reality and imagination. But first, let's take a step back in time to 2001. As an 11-year-old in India, I became obsessed with computer graphics and visual effects. Of course, at that age, it meant making cheesy videos kind of like this. But therein started a foundational theme in my life, the quest to blend reality and imagination. And that quest has stayed with me and permeated across my decade-long career in tech, working as a product manager at companies like Google and as a content creator on platforms like YouTube and TikTok.

So today, let's deconstruct this quest to blend reality and imagination and explore how it’s getting supercharged -- buzzword alert -- by artificial intelligence. Let's start with the reality bit.

You probably heard about photogrammetry. It's the art and science of measuring stuff in the real world using photos and other sensors. What required massive data centers and teams of experts in the 2000s became increasingly democratized by the 2010s. Then, of course, machine learning came along and took things to a whole new level with techniques like neural radiance fields, or NeRFs.

What you're seeing here is an AI model creating a ground-up volumetric 3D representation using 2D images alone. But unlike older techniques for reality capture, NeRFs do a really good job of encapsulating the sheer complexity and nuance of reality. The vibe, if you will.
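
To make "volumetric" a little more concrete: NeRF-style renderers typically sample a density and a color at points along each camera ray and composite them front to back into one pixel. The sketch below assumes that standard compositing rule; the function name `composite_ray` and the sample values are hypothetical illustrations, not anything shown in the talk.

```python
import math

def composite_ray(densities, colors, step):
    """Front-to-back compositing of samples along a single camera ray.

    densities: non-negative density at each sample (hypothetical values)
    colors:    (r, g, b) tuple at each sample
    step:      distance between consecutive samples
    """
    transmittance = 1.0          # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb in zip(densities, colors):
        alpha = 1.0 - math.exp(-sigma * step)   # opacity of this segment
        weight = transmittance * alpha          # contribution to the pixel
        for c in range(3):
            pixel[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
    return pixel, transmittance

# Empty space in front of a dense red surface: the pixel comes out almost pure red.
pixel, remaining = composite_ray(
    densities=[0.0, 0.0, 50.0],
    colors=[(0, 0, 1), (0, 0, 1), (1, 0, 0)],
    step=0.5,
)
```

In a real NeRF the densities and colors come from a trained neural network queried at each 3D point; here they are simply passed in as lists.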

Twelve months later, you can do all of this stuff using the iPhone in your pocket, using apps like Luma. It's like 3D screenshots for the real world. Capture anything once and reframe it infinitely in postproduction, so you can start building that collection of spaces, places and objects that you truly care about and conjure them up in your future creations.

So that's the reality bit. As NeRFs were popping off last year, the AI summer was also in full effect, with Midjourney, DALL-E 2, Stable Diffusion all hitting the market around the same time. But what I fell in love with was inpainting. This technique allows you to take existing imagery and augment it with whatever you like, and the results are photorealistically fantastic. It blew my mind because stuff that would have taken me like three hours in classical workflows I could pull off in just three minutes.
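
The basic shape of inpainting can be sketched in a few lines: keep every pixel outside a mask and let something else fill in the pixels inside it. This is a minimal sketch under that assumption only. The `generate` callback is a hypothetical stand-in for the AI model, which in real tools also looks at the surrounding pixels and a text prompt.

```python
def inpaint(image, mask, generate):
    """Return a copy of `image` where masked pixels are replaced.

    image:    2D list of pixel values
    mask:     2D list of booleans, True where content should be regenerated
    generate: hypothetical callback (row, col) -> new pixel value
    """
    return [
        [generate(r, c) if mask[r][c] else image[r][c]
         for c in range(len(image[r]))]
        for r in range(len(image))
    ]

# A 2x3 grayscale "photo" with the middle column masked out.
photo = [[10, 20, 30],
         [40, 50, 60]]
hole = [[False, True, False],
        [False, True, False]]
result = inpaint(photo, hole, lambda r, c: 255)
```

Everything outside the mask comes back untouched, which is why the edit blends into the original image rather than replacing it.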

But I wanted more. Enter ControlNet, a game-changing technique by Stanford researchers that allows you to use various input conditions to guide and control the AI image generation process. So in my case, I could take the depth information and the texture detail from my 3D scans and use it to literally reskin reality.

Now, this isn't just cool video. There’s a lot of useful use cases, too. For example, in this case I'm taking a 3D scan of my parents' drawing room, as my mother likes to call it, and reskinning it to different styles of Indian decor and doing so while respecting the spatial context and the layout of the interior space. If you squint, I'm sure you can see how this is going to transform architecture and interior design forever.

You could take that 2016 scan of a Buddha statue and reskin it to be gloriously golden while pulling off these impossible camera moves you just couldn't do any other way. Or you could take that vacation footage from your trip to Tokyo and bring these cherry blossoms to life in a whole new way. And let me tell you, cherry blossoms look really good during the day, but they look even better at night. Oh, my God. They sure are glowing.

It's almost like this dreamlike quality where you can use AI to accentuate the best aspects of the real world. Natural landscapes look just as beautiful. Like this waterfall that could be on another planet. But of course, you could go over the hills and far away to the French Alps from another dimension.

But it's not just static scenes. You can do this stuff with video, too. I can't wait till this technology is running at 30 frames per second because it's going to transform augmented reality and 3D rendering. I mean, how soon until we're channel-surfing realities layered on top of the real world?

Of course, just like reality capture got democratized, all these tools from last year are getting even easier. So instead of me spending hours weaving together a bunch of different tools, tools like Runway and Kaiber let you do exactly the same stuff with just a couple clicks. Want to go from day to night? No problemo. Want to get that retro 90s aesthetic from "Full House"? You can do that too.

But it goes beyond reality capture. Companies like Wonder Dynamics are turning video into this immaculate form of performance capture so you can embody fantastical creatures using the phone in your pocket. This is stuff that James Cameron only dreamt about in the 2000s. And now you could do it with your iPhone? That’s absolutely wild to me.

So when I look back at the past two decades and this ill-tailored tapestry of tools that I've had to learn, I feel a sense of optimism for what lies ahead for the next generation of creators. The 11-year-olds of today don't have to worry about all of that crap. All they need to do is have a creative vision and a knack for working in concert with these AI models, these AI models that are truly a distillation of human knowledge and creativity. And that's a future I'm excited about, a future where you can blend reality and imagination with your trusty AI copilot.

Thank you very much.

(Applause)
