Sam Gregory: When AI can fake reality, who can you trust?

It's getting harder, isn't it, to spot real from fake, AI-generated from human-generated. With generative AI, along with other advances in deep fakery, it doesn't take many seconds of your voice, many images of your face, to fake you, and the realism keeps increasing.

I first started working on deepfakes in 2017, when the threat to our trust in information was overhyped, and the big harm, in reality, was falsified sexual images. Now that problem keeps growing, harming women and girls worldwide. But also, with advances in generative AI, we're now also approaching a world where it's broadly easier to make fake reality, but also to dismiss reality as possibly faked.

Now, deceptive and malicious audiovisual AI is not the root of our societal problems, but it's likely to contribute to them. Audio clones are proliferating in a range of electoral contexts. "Is it, isn't it" claims cloud human-rights evidence from war zones, sexual deepfakes target women in public and in private, and synthetic avatars impersonate news anchors.

I lead WITNESS. We're a human-rights group that helps people use video and technology to protect and defend their rights. And for the last five years, we've coordinated a global effort, "Prepare, Don't Panic," around these new ways to manipulate and synthesize reality, and on how to fortify the truth of critical frontline journalists and human-rights defenders.

Now, one element in that is a deepfakes rapid-response task force, made up of media-forensics experts and companies who donate their time and skills to debunk deepfakes and claims of deepfakes. The task force recently received three audio clips, from Sudan, West Africa and India. People were claiming that the clips were deepfaked, not real. In the Sudan case, experts used a machine-learning algorithm trained on over a million examples of synthetic speech to prove, almost without a shadow of a doubt, that it was authentic. In the West Africa case, they couldn't reach a definitive conclusion because of the challenges of analyzing audio from Twitter, and with background noise.

The third clip was leaked audio of a politician from India. Nilesh Christopher of "Rest of World" brought the case to the task force. The experts used almost an hour of samples to develop a personalized model of the politician's authentic voice. Despite his loud and fast claims that it was all falsified with AI, experts concluded that it at least was partially real, not AI. As you can see, even experts cannot rapidly and conclusively separate true from false, and the ease of calling "that's deepfaked" on something real is increasing.

The future is full of profound challenges, both in protecting the real and detecting the fake. We're already seeing the warning signs of this challenge of discerning fact from fiction. Audio and video deepfakes have targeted politicians, major political leaders in the EU, Turkey and Mexico, and US mayoral candidates. Political ads are incorporating footage of events that never happened, and people are sharing AI-generated imagery from crisis zones, claiming it to be real.

Now, again, this problem is not entirely new. The human-rights defenders and journalists I work with are used to having their stories dismissed, and they're used to widespread, deceptive, shallow fakes, videos and images taken from one context or time or place and claimed as if they're in another, used to share confusion and spread disinformation. And of course, we live in a world that is full of partisanship and plentiful confirmation bias.

Given all that, the last thing we need is a diminishing baseline of the shared, trustworthy information upon which democracies thrive, where the specter of AI is used to plausibly believe things you want to believe, and plausibly deny things you want to ignore.

But I think there's a way we can prevent that future, if we act now; that if we "Prepare, Don't Panic," we'll kind of make our way through this somehow. Panic won't serve us well. [It] plays into the hands of governments and corporations who will abuse our fears, and into the hands of people who want a fog of confusion and will use AI as an excuse.

How many people were taken in, just for a minute, by the Pope in his dripped-out puffer jacket? You can admit it.

(Laughter)

More seriously, how many of you know someone who's been scammed by an audio that sounds like their kid? And for those of you who are thinking "I wasn't taken in, I know how to spot a deepfake," any tip you know now is already outdated. Deepfakes didn't blink, they do now. Six-fingered hands were more common in deepfake land than real life -- not so much. Technical advances erase those visible and audible clues that we so desperately want to hang on to as proof we can discern real from fake.

But it also really shouldn't be on us to make that guess without any help. Between real deepfakes and claimed deepfakes, we need big-picture, structural solutions. We need robust foundations that enable us to discern authentic from simulated, tools to fortify the credibility of critical voices and images, and powerful detection technology that doesn't raise more doubts than it fixes.

There are three steps we need to take to get to that future. Step one is to ensure that the detection skills and tools are in the hands of the people who need them. I've talked to hundreds of journalists, community leaders and human-rights defenders, and they're in the same boat as you and me and us. They're listening to the audio, trying to think, "Can I spot a glitch?" Looking at the image, saying, "Oh, does that look right or not?" Or maybe they're going online to find a detector. And the detector they find, they don't know whether they're getting a false positive, a false negative, or a reliable result.

Here's an example. I used a detector, which got the Pope in the puffer jacket right. But then, when I put in the Easter bunny image that I made for my kids, it said that it was human-generated. This is because of some big challenges in deepfake detection. Detection tools often only work on one single way to make a deepfake, so you need multiple tools, and they don't work well on low-quality social media content. Confidence score, 0.76-0.87, how do you know whether that's reliable, if you don't know if the underlying technology is reliable, or whether it works on the manipulation that is being used? And tools to spot an AI manipulation don't spot a manual edit.

These tools also won't be available to everyone. There's a trade-off between security and access, which means if we make them available to anyone, they become useless to everybody, because the people designing the new deception techniques will test them on the publicly available detectors and evade them. But we do need to make sure these are available to the journalists, the community leaders, the election officials, globally, who are our first line of defense, thought through with attention to real-world accessibility and use. Though at the best circumstances, detection tools will be 85 to 95 percent effective, they have to be in the hands of that first line of defense, and they're not, right now.
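
To make the "85 to 95 percent effective" figure concrete, here is a minimal sketch of the base-rate arithmetic behind detector flags. It is an illustration, not a figure from the talk: it assumes "effective" means the detector is equally accurate on fake and on real clips, and it assumes a hypothetical 5 percent share of fakes among the clips being checked.

```python
def share_of_flags_that_are_fake(accuracy: float, fake_rate: float) -> float:
    """Fraction of clips flagged as fake that really are fake (Bayes' rule).

    Assumption: `accuracy` applies equally to fake clips (caught correctly)
    and to real clips (cleared correctly).
    """
    true_flags = accuracy * fake_rate               # fakes correctly flagged
    false_flags = (1 - accuracy) * (1 - fake_rate)  # real clips wrongly flagged
    return true_flags / (true_flags + false_flags)


# Hypothetical scenario: 5% of the clips reaching a newsroom are fake.
for accuracy in (0.85, 0.95):
    share = share_of_flags_that_are_fake(accuracy, fake_rate=0.05)
    print(f"accuracy {accuracy:.0%}: {share:.0%} of flagged clips are fake")
```

Under these assumptions, even the 95 percent detector is wrong about roughly half of the clips it flags, because real clips vastly outnumber fakes. That is one reason a bare score needs trained people and context around it.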

So for step one, I've been talking about detection after the fact. Step two -- AI is going to be everywhere in our communication, creating, changing, editing. It's not going to be a simple binary of "yes, it's AI" or "phew, it's not." AI is part of all of our communication, so we need to better understand the recipe of what we're consuming.

Some people call this content provenance and disclosure. Technologists have been building ways to add invisible watermarking to AI-generated media. They've also been designing ways -- and I've been part of these efforts -- within a standard called the C2PA, to add cryptographically signed metadata to files. This means data that provides details about the content, cryptographically signed in a way that reinforces our trust in that information. It's an updating record of how AI was used to create or edit it, where humans and other technologies were involved, and how it was distributed. It's basically a recipe and serving instructions for the mix of AI and human that's in what you're seeing and hearing. And it's a critical part of a new AI-infused media literacy.
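
The idea of cryptographically signed metadata can be sketched in a few lines. This is a simplified illustration, not the C2PA format or API: the function names, the `actions` list and the demo key are all hypothetical, and an HMAC from the Python standard library stands in for the certificate-based public-key signatures a real provenance standard would use.

```python
import hashlib
import hmac
import json

# Hypothetical demo key. Real provenance systems use certificate-based
# public-key signatures; HMAC is only a standard-library stand-in here.
DEMO_KEY = b"demo-signing-key"


def sign_manifest(content: bytes, actions: list[str], key: bytes = DEMO_KEY) -> dict:
    """Bind a list of how-it-was-made actions to the content's hash."""
    claim = {
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "actions": actions,  # the "how" of the media, not the "who"
    }
    payload = json.dumps(claim, sort_keys=True).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}


def verify_manifest(content: bytes, manifest: dict, key: bytes = DEMO_KEY) -> bool:
    """Return True only if both the claim and the content are unmodified."""
    payload = json.dumps(manifest["claim"], sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["signature"]):
        return False  # the recipe itself was altered
    actual_hash = hashlib.sha256(content).hexdigest()
    return hmac.compare_digest(actual_hash, manifest["claim"]["content_sha256"])


image = b"pixels of an AI-generated Easter bunny"
manifest = sign_manifest(image, ["created: generative AI", "edited: human crop"])

print(verify_manifest(image, manifest))                 # untouched file
print(verify_manifest(image + b" tampered", manifest))  # content changed
```

The manifest here records only how the media was made, never who made it, which matches the privacy concern raised later in the talk. Changing either the file or the recorded actions makes verification fail.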

And this actually shouldn't sound that crazy. Our communication is moving in this direction already. If you're like me -- you can admit it -- you browse your TikTok "For You" page, and you're used to seeing videos that have an audio source, an AI filter, a green screen, a background, a stitch with another edit. This, in some sense, is the alpha version of this transparency in some of the major platforms we use today. It's just that it does not yet travel across the internet, it's not reliable, updatable, and it's not secure.

Now, there are also big challenges in this type of infrastructure for authenticity. As we create these durable signs of how AI and human were mixed, that carry across the trajectory of how media is made, we need to ensure they don't compromise privacy or backfire globally. We have to get this right.

We can't oblige a citizen journalist filming in a repressive context or a satirical maker using novel gen-AI tools to parody the powerful ... to have to disclose their identity or personally identifiable information in order to use their camera or ChatGPT. Because it's important they be able to retain their ability to have anonymity, at the same time as the tool to create is transparent. This needs to be about the how of AI-human media making, not the who.

This brings me to the final step. None of this works without a pipeline of responsibility that runs from the foundation models and the open-source projects through to the way that is deployed into systems, APIs and apps, to the platforms where we consume media and communicate.

I've spent much of the last 15 years fighting, essentially, a rearguard action, like so many of my colleagues in the human rights world, against the failures of social media. We can't make those mistakes again in this next generation of technology. What this means is that governments need to ensure that within this pipeline of responsibility for AI, there is transparency, accountability and liability.

Without these three steps -- detection for the people who need it most, provenance that is rights-respecting and that pipeline of responsibility, we're going to get stuck looking in vain for the six-fingered hand, or the eyes that don't blink. We need to take these steps. Otherwise, we risk a world where it gets easier and easier to both fake reality and dismiss reality as potentially faked.

And that is a world that the political philosopher Hannah Arendt described in these terms: "A people that no longer can believe anything cannot make up its own mind. It is deprived not only of its capacity to act but also of its capacity to think and to judge. And with such a people you can then do what you please." That's a world I know none of us want, that I think we can prevent.

Thanks.

(Cheers and applause)
