Sinan Aral: How we can protect truth in the age of misinformation

So, on April 23 of 2013, the Associated Press put out the following tweet on Twitter. It said, "Breaking news: Two explosions at the White House and Barack Obama has been injured." This tweet was retweeted 4,000 times in less than five minutes, and it went viral thereafter.

Now, this tweet wasn't real news put out by the Associated Press. In fact it was false news, or fake news, that was propagated by Syrian hackers who had infiltrated the Associated Press Twitter handle. Their purpose was to disrupt society, but they disrupted much more. Because automated trading algorithms immediately seized on the sentiment of this tweet, and began trading based on the potential that the president of the United States had been injured or killed in this explosion. And as they started trading, they immediately sent the stock market crashing, wiping out 140 billion dollars in equity value in a single day.

Robert Mueller, special counsel prosecutor in the United States, issued indictments against three Russian companies and 13 Russian individuals on a conspiracy to defraud the United States by meddling in the 2016 presidential election. And the story this indictment tells is the story of the Internet Research Agency, the shadowy arm of the Kremlin on social media. During the presidential election alone, the Internet Research Agency's efforts reached 126 million people on Facebook in the United States, issued three million individual tweets and 43 hours' worth of YouTube content. All of which was fake -- misinformation designed to sow discord in the US presidential election.

A recent study by Oxford University showed that in the recent Swedish elections, one third of all of the information spreading on social media about the election was fake or misinformation.

In addition, these types of social-media misinformation campaigns can spread what has been called "genocidal propaganda," for instance against the Rohingya in Burma, or trigger mob killings, as in India.

We studied fake news and began studying it before it was a popular term. And we recently published the largest-ever longitudinal study of the spread of fake news online on the cover of "Science" in March of this year. We studied all of the verified true and false news stories that ever spread on Twitter, from its inception in 2006 to 2017. And when we studied this information, we studied verified news stories that were verified by six independent fact-checking organizations. So we knew which stories were true and which stories were false. We can measure their diffusion, the speed of their diffusion, the depth and breadth of their diffusion, how many people become entangled in this information cascade and so on. And what we did in this paper was we compared the spread of true news to the spread of false news. And here's what we found.
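To make the diffusion measures mentioned here concrete, the following is a minimal sketch, not the study's actual code. It assumes a cascade is given as hypothetical (parent, child) retweet pairs that form a tree, and it computes the size, depth and maximum breadth of that cascade. The function name `cascade_metrics` and the sample data are invented for illustration.

```python
from collections import defaultdict, deque

def cascade_metrics(root, retweets):
    """Return (size, depth, max_breadth) of a retweet cascade.

    root     -- the user who posted the original tweet
    retweets -- iterable of (parent, child) pairs: child retweeted parent
    Assumes the pairs form a tree rooted at `root` (a simplification).
    """
    children = defaultdict(list)
    for parent, child in retweets:
        children[parent].append(child)

    breadth_at_depth = defaultdict(int)  # number of users at each hop from the root
    queue = deque([(root, 0)])
    while queue:
        user, depth = queue.popleft()
        breadth_at_depth[depth] += 1
        for child in children[user]:
            queue.append((child, depth + 1))

    size = sum(breadth_at_depth.values())         # users involved, including the root
    depth = max(breadth_at_depth)                 # longest chain of retweets
    max_breadth = max(breadth_at_depth.values())  # widest single level
    return size, depth, max_breadth

# Hypothetical cascade: a is retweeted by b and c; c by d, e and f; f by g.
example = [("a", "b"), ("a", "c"), ("c", "d"), ("c", "e"), ("c", "f"), ("f", "g")]
print(cascade_metrics("a", example))
```

Comparing these three numbers for cascades of true stories against cascades of false stories is one simple way to express "further, deeper and more broadly" as measurable quantities.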

We found that false news diffused further, faster, deeper and more broadly than the truth in every category of information that we studied, sometimes by an order of magnitude. And in fact, false political news was the most viral. It diffused further, faster, deeper and more broadly than any other type of false news. When we saw this, we were at once worried but also curious. Why? Why does false news travel so much further, faster, deeper and more broadly than the truth?

The first hypothesis that we came up with was, "Well, maybe people who spread false news have more followers or follow more people, or tweet more often, or maybe they're more often 'verified' users of Twitter, with more credibility, or maybe they've been on Twitter longer." So we checked each one of these in turn. And what we found was exactly the opposite. False-news spreaders had fewer followers, followed fewer people, were less active, less often "verified" and had been on Twitter for a shorter period of time. And yet, false news was 70 percent more likely to be retweeted than the truth, controlling for all of these and many other factors.

So we had to come up with other explanations. And we devised what we called a "novelty hypothesis." So if you read the literature, it is well known that human attention is drawn to novelty, things that are new in the environment. And if you read the sociology literature, you know that we like to share novel information. It makes us seem like we have access to inside information, and we gain in status by spreading this kind of information.

So what we did was we measured the novelty of an incoming true or false tweet, compared to the corpus of what that individual had seen in the 60 days prior on Twitter. But that wasn't enough, because we thought to ourselves, "Well, maybe false news is more novel in an information-theoretic sense, but maybe people don't perceive it as more novel."
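One way to picture novelty "in an information-theoretic sense" is a divergence between word distributions. The sketch below is an assumption for illustration only: the talk does not say which measure was used, and the function name `novelty`, the smoothing parameter `alpha` and the sample tweets are hypothetical. It computes the Kullback-Leibler divergence of an incoming tweet's word distribution from the distribution of tweets the user saw earlier; the caller is assumed to pass only tweets from the 60-day window as `history`.

```python
import math
from collections import Counter

def novelty(tweet, history, alpha=1.0):
    """KL divergence (in bits) of the tweet's word distribution from the
    word distribution of previously seen tweets, with add-alpha smoothing.
    Higher values mean the tweet looks less like what the user has seen."""
    tweet_counts = Counter(tweet.lower().split())
    hist_counts = Counter(w for t in history for w in t.lower().split())
    vocab = set(tweet_counts) | set(hist_counts)

    def smoothed(counts):
        # Add-alpha smoothing keeps every probability strictly positive.
        total = sum(counts.values()) + alpha * len(vocab)
        return {w: (counts[w] + alpha) / total for w in vocab}

    p, q = smoothed(tweet_counts), smoothed(hist_counts)
    return sum(p[w] * math.log2(p[w] / q[w]) for w in vocab)

# Hypothetical timeline and two incoming tweets.
seen = ["the market opened higher today", "the market closed lower today"]
familiar = novelty("the market opened lower today", seen)
surprising = novelty("explosions reported at the capital", seen)
print(familiar < surprising)
```

A score like this captures only statistical novelty, which is why the perception question raised in the talk still needs a separate check.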

So to understand people's perceptions of false news, we looked at the information and the sentiment contained in the replies to true and false tweets. And what we found was that across a bunch of different measures of sentiment -- surprise, disgust, fear, sadness, anticipation, joy and trust -- the replies to false tweets exhibited significantly more surprise and disgust, and the replies to true tweets exhibited significantly more anticipation, joy and trust. The surprise corroborates our novelty hypothesis. This is new and surprising, and so we're more likely to share it.
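A lexicon-based count is one simple way to measure emotions in replies. The sketch below is illustrative only and assumes a tiny hypothetical lexicon (`EMOTION_LEXICON`) and a hypothetical helper `emotion_profile`; the talk does not describe the actual tooling, and a real analysis would rely on a curated emotion lexicon with far more entries.

```python
from collections import Counter

# Hypothetical toy lexicon mapping words to one of the emotions named in the talk.
EMOTION_LEXICON = {
    "shocking": "surprise", "unbelievable": "surprise", "wow": "surprise",
    "disgusting": "disgust", "vile": "disgust",
    "hope": "anticipation", "glad": "joy", "reliable": "trust",
}

def emotion_profile(replies):
    """Return each emotion's share of all lexicon words found in the replies."""
    counts = Counter()
    for reply in replies:
        for word in reply.lower().split():
            word = word.strip(".,!?\"'")  # drop surrounding punctuation
            if word in EMOTION_LEXICON:
                counts[EMOTION_LEXICON[word]] += 1
    total = sum(counts.values())
    return {emotion: n / total for emotion, n in counts.items()} if total else {}

# Hypothetical replies to a false tweet.
profile = emotion_profile(["Wow, this is shocking!", "Disgusting."])
print(profile)
```

Computing such a profile separately for replies to true and to false tweets gives two distributions that can then be compared emotion by emotion.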

At the same time, there was congressional testimony in front of both houses of Congress in the United States, looking at the role of bots in the spread of misinformation. So we looked at this too -- we used multiple sophisticated bot-detection algorithms to find the bots in our data and to pull them out. So we pulled them out, we put them back in and we compared what happens to our measurement. And what we found was that, yes indeed, bots were accelerating the spread of false news online, but they were accelerating the spread of true news at approximately the same rate. Which means bots are not responsible for the differential diffusion of truth and falsity online. We can't abdicate that responsibility, because we, humans, are responsible for that spread.
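The "pull them out, put them back in" comparison can be sketched as an ablation. The example below is a simplified illustration with made-up numbers, not the study's method: it assumes each retweet is a hypothetical pair of flags (story is false, account is a bot) and compares the ratio of false-story to true-story retweets with and without bot activity.

```python
def false_to_true_ratio(retweets, include_bots=True):
    """Ratio of retweets of false stories to retweets of true stories.

    retweets -- iterable of (story_is_false, account_is_bot) boolean pairs
    Assumes at least one retweet of a true story remains after filtering.
    """
    false_count = true_count = 0
    for story_is_false, account_is_bot in retweets:
        if account_is_bot and not include_bots:
            continue  # drop activity attributed to bots
        if story_is_false:
            false_count += 1
        else:
            true_count += 1
    return false_count / true_count

# Hypothetical data: 60 human + 30 bot retweets of false stories,
# 20 human + 10 bot retweets of true stories.
data = ([(True, False)] * 60 + [(True, True)] * 30 +
        [(False, False)] * 20 + [(False, True)] * 10)
with_bots = false_to_true_ratio(data, include_bots=True)      # 90 / 30
without_bots = false_to_true_ratio(data, include_bots=False)  # 60 / 20
print(with_bots, without_bots)
```

In this invented data the bots amplify both kinds of story by the same factor, so the ratio is the same with or without them. That is the pattern the talk describes: bots add volume but do not explain the gap between false and true news.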

Now, everything that I have told you so far, unfortunately for all of us, is the good news.

The reason is that it's about to get a whole lot worse. And two specific technologies are going to make it worse. We are going to see the rise of a tremendous wave of synthetic media. Fake video, fake audio that is very convincing to the human eye. And this will be powered by two technologies.

The first of these is known as "generative adversarial networks." This is a machine-learning model with two networks: a discriminator, whose job it is to determine whether something is true or false, and a generator, whose job it is to generate synthetic media. So the synthetic generator generates synthetic video or audio, and the discriminator tries to tell, "Is this real or is this fake?" And in fact, it is the job of the generator to maximize the likelihood that it will fool the discriminator into thinking the synthetic video and audio that it is creating is actually true. Imagine a machine in a hyperloop, trying to get better and better at fooling us.
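The tug-of-war between the two networks can be written as a pair of loss functions. The sketch below shows one common formulation (binary cross-entropy with a non-saturating generator loss), stated here as an assumption since the talk does not give formulas. The function names are hypothetical, and the inputs are the discriminator's output probabilities, assumed to lie strictly between 0 and 1.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy for the discriminator.

    d_real -- discriminator's probability that a real sample is real
    d_fake -- discriminator's probability that a generated sample is real
    Low when real samples score near 1 and generated samples score near 0.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Non-saturating generator loss: low when the discriminator is fooled,
    that is, when it gives generated samples a probability near 1."""
    return -math.log(d_fake)

# The generator is better off when the discriminator rates its output as real.
print(generator_loss(0.9) < generator_loss(0.1))
# The discriminator is better off when it separates real from generated samples.
print(discriminator_loss(0.9, 0.1) < discriminator_loss(0.5, 0.5))
```

Training alternates between lowering the first loss by updating the discriminator and lowering the second by updating the generator, which is the loop of mutual improvement the talk describes.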

This is combined with the second technology, which is essentially the democratization of artificial intelligence: the ability for anyone, without any background in artificial intelligence or machine learning, to deploy these kinds of algorithms to generate synthetic media. Together, they make it ultimately so much easier to create videos.

The White House issued a false, doctored video of a journalist interacting with an intern who was trying to take his microphone. They removed frames from this video in order to make his actions seem more punchy. And when videographers and stuntmen and stuntwomen were interviewed about this type of technique, they said, "Yes, we use this in the movies all the time to make our punches and kicks look more choppy and more aggressive." They then put out this video and partly used it as justification to revoke the White House press pass of the reporter, Jim Acosta. And CNN had to sue to have that press pass reinstated.

There are about five different paths that I can think of that we can follow to try and address some of these very difficult problems today. Each one of them has promise, but each one of them has its own challenges. The first one is labeling. Think about it this way: when you go to the grocery store to buy food to consume, it's extensively labeled. You know how many calories it has, how much fat it contains -- and yet when we consume information, we have no labels whatsoever. What is contained in this information? Is the source credible? Where is this information gathered from? We have none of that information when we are consuming information. That is a potential avenue, but it comes with its challenges. For instance, who gets to decide, in society, what's true and what's false? Is it the governments? Is it Facebook? Is it an independent consortium of fact-checkers? And who's checking the fact-checkers?

Another potential avenue is incentives. We know that during the US presidential election there was a wave of misinformation that came from Macedonia that didn't have any political motive but instead had an economic motive. And this economic motive existed, because false news travels so much farther, faster and more deeply than the truth, and you can earn advertising dollars as you garner eyeballs and attention with this type of information. But if we can depress the spread of this information, perhaps it would reduce the economic incentive to produce it at all in the first place.

Third, we can think about regulation, and certainly, we should think about this option. In the United States, currently, we are exploring what might happen if Facebook and others are regulated. While we should consider things like regulating political speech, labeling the fact that it's political speech, making sure foreign actors can't fund political speech, it also has its own dangers. For instance, Malaysia just instituted a six-year prison sentence for anyone found spreading misinformation. And in authoritarian regimes, these kinds of policies can be used to suppress minority opinions and to continue to extend repression.

The fourth possible option is transparency. We want to know how Facebook's algorithms work. How does the data combine with the algorithms to produce the outcomes that we see? We want them to open the kimono and show us exactly the inner workings of how Facebook is working. And if we want to know social media's effect on society, we need scientists, researchers and others to have access to this kind of information. But at the same time, we are asking Facebook to lock everything down, to keep all of the data secure.

So, Facebook and the other social media platforms are facing what I call a transparency paradox. We are asking them, at the same time, to be open and transparent and, simultaneously, secure. This is a very difficult needle to thread, but they will need to thread this needle if we are to achieve the promise of social technologies while avoiding their peril.

The final thing that we could think about is algorithms and machine learning: technology devised to root out and understand fake news, how it spreads, and to try and dampen its flow. Humans have to be in the loop of this technology, because we can never escape that underlying any technological solution or approach is a fundamental ethical and philosophical question about how we define truth and falsity, to whom we give the power to define truth and falsity, and which opinions are legitimate, which type of speech should be allowed and so on. Technology is not a solution for that. Ethics and philosophy are a solution for that.

Nearly every theory of human decision making, human cooperation and human coordination has some sense of the truth at its core. But with the rise of fake news, the rise of fake video, the rise of fake audio, we are teetering on the brink of the end of reality, where we cannot tell what is real from what is fake. And that's potentially incredibly dangerous.

We have to be vigilant in defending the truth against misinformation. With our technologies, with our policies and, perhaps most importantly, with our own individual responsibilities, decisions, behaviors and actions.

Thank you very much.

(Applause)
