Sebastian Wernicke: How to use data to make a hit TV show

Roy Price is a man that most of you have probably never heard about, even though he may have been responsible for 22 somewhat mediocre minutes of your life on April 19, 2013. He may have also been responsible for 22 very entertaining minutes, but not very many of you. And all of that goes back to a decision that Roy had to make about three years ago.

So you see, Roy Price is a senior executive with Amazon Studios. That's the TV production company of Amazon. He's 47 years old, slim, spiky hair, describes himself on Twitter as "movies, TV, technology, tacos." And Roy Price has a very responsible job, because it's his responsibility to pick the shows, the original content that Amazon is going to make. And of course that's a highly competitive space. I mean, there are so many TV shows already out there, that Roy can't just choose any show. He has to find shows that are really, really great. So in other words, he has to find shows that are on the very right end of this curve here.

So this curve here is the rating distribution of about 2,500 TV shows on the website IMDb, and the rating goes from one to 10, and the height here shows you how many shows get that rating. So if your show gets a rating of nine points or higher, that's a winner. Then you have a top two percent show. That's shows like "Breaking Bad," "Game of Thrones," "The Wire," so all of these shows that are addictive, where, after you've watched a season, your brain is basically like, "Where can I get more of these episodes?" That kind of show. On the left side, just for clarity, here on that end, you have a show called "Toddlers and Tiaras" --

(Laughter)

-- which should tell you enough about what's going on on that end of the curve.

Now, Roy Price is not worried about getting on the left end of the curve, because I think you would have to have some serious brainpower to undercut "Toddlers and Tiaras." So what he's worried about is this middle bulge here, the bulge of average TV, you know, those shows that aren't really good or really bad, they don't really get you excited. So he needs to make sure that he's really on the right end of this.

So the pressure is on, and of course it's also the first time that Amazon is even doing something like this, so Roy Price does not want to take any chances. He wants to engineer success. He needs a guaranteed success, and so what he does is, he holds a competition.

So he takes a bunch of ideas for TV shows, and from those ideas, through an evaluation, they select eight candidates for TV shows, and then he just makes the first episode of each one of these shows and puts them online for free for everyone to watch. And so when Amazon is giving out free stuff, you're going to take it, right? So millions of viewers are watching those episodes.

What they don't realize is that, while they're watching their shows, actually, they are being watched. They are being watched by Roy Price and his team, who record everything. They record when somebody presses play, when somebody presses pause, what parts they skip, what parts they watch again. So they collect millions of data points, because they want to have those data points to then decide which show they should make. And sure enough, so they collect all the data, they do all the data crunching, and an answer emerges, and the answer is, "Amazon should do a sitcom about four Republican US Senators." They did that show.

So does anyone know the name of the show? (Audience: "Alpha House.") Yes, "Alpha House," but it seems like not too many of you here remember that show, actually, because it didn't turn out that great. It's actually just an average show -- literally, in fact, because the average of this curve here is at 7.4, and "Alpha House" lands at 7.5, so a slightly above average show, but certainly not what Roy Price and his team were aiming for.

Meanwhile, however, at about the same time, at another company, another executive did manage to land a top show using data analysis, and his name is Ted, Ted Sarandos, who is the Chief Content Officer of Netflix, and just like Roy, he's on a constant mission to find that great TV show, and he uses data as well to do that, except he does it a little bit differently. So instead of holding a competition, what he did -- and his team of course -- was they looked at all the data they already had about Netflix viewers, you know, the ratings they give their shows, the viewing histories, what shows people like, and so on. And then they used that data to discover all of these little bits and pieces about the audience: what kinds of shows they like, what kind of producers, what kind of actors. And once they had all of these pieces together, they took a leap of faith, and they decided to license not a sitcom about four Senators but a drama series about a single Senator. You guys know the show?

(Laughter)

Yes, "House of Cards," and Netflix of course, nailed it with that show, at least for the first two seasons.

(Laughter) (Applause)

"House of Cards" gets a 9.1 rating on this curve, so it's exactly where they wanted it to be.

Now, the question of course is, what happened here? So you have two very competitive, data-savvy companies. They connect all of these millions of data points, and then it works beautifully for one of them, and it doesn't work for the other one. So why? Because logic kind of tells you that this should be working all the time. I mean, if you're collecting millions of data points on a decision you're going to make, then you should be able to make a pretty good decision. You have 200 years of statistics to rely on. You're amplifying it with very powerful computers. The least you could expect is good TV, right?

And if data analysis does not work that way, then it actually gets a little scary, because we live in a time where we're turning to data more and more to make very serious decisions that go far beyond TV. Does anyone here know the company Multi-Health Systems? No one. OK, that's good actually. OK, so Multi-Health Systems is a software company, and I hope that nobody here in this room ever comes into contact with that software, because if you do, it means you're in prison.

(Laughter)

If someone here in the US is in prison, and they apply for parole, then it's very likely that data analysis software from that company will be used in determining whether to grant that parole. So it's the same principle as Amazon and Netflix, but now instead of deciding whether a TV show is going to be good or bad, you're deciding whether a person is going to be good or bad. And mediocre TV, 22 minutes, that can be pretty bad, but more years in prison, I guess, even worse.

And unfortunately, there is actually some evidence that this data analysis, despite having lots of data, does not always produce optimum results. And that's not because a company like Multi-Health Systems doesn't know what to do with data. Even the most data-savvy companies get it wrong. Yes, even Google gets it wrong sometimes.

In 2009, Google announced that they were able to predict outbreaks of influenza, the nasty kind of flu, by doing data analysis on their Google searches. And it worked beautifully, and it made a big splash in the news, including the pinnacle of scientific success: a publication in the journal "Nature." It worked beautifully for year after year after year, until one year it failed. And nobody could even tell exactly why. It just didn't work that year, and of course that again made big news, including now a retraction of a publication from the journal "Nature." So even the most data-savvy companies, Amazon and Google, they sometimes get it wrong. And despite all those failures, data is moving rapidly into real-life decision-making -- into the workplace, law enforcement, medicine. So we had better make sure that data is helping.

Now, personally I've seen a lot of this struggle with data myself, because I work in computational genetics, which is also a field where lots of very smart people are using unimaginable amounts of data to make pretty serious decisions like deciding on a cancer therapy or developing a drug. And over the years, I've noticed a sort of pattern or kind of rule, if you will, about the difference between successful decision-making with data and unsuccessful decision-making, and I find this a pattern worth sharing, and it goes something like this.

So whenever you're solving a complex problem, you're doing essentially two things. The first one is, you take that problem apart into its bits and pieces so that you can deeply analyze those bits and pieces, and then of course you do the second part. You put all of these bits and pieces back together again to come to your conclusion. And sometimes you have to do it over again, but it's always those two things: taking apart and putting back together again.

And now the crucial thing is that data and data analysis are only good for the first part. Data and data analysis, no matter how powerful, can only help you take a problem apart and understand its pieces. They're not suited to putting those pieces back together again and then coming to a conclusion. There's another tool that can do that, and we all have it, and that tool is the brain. If there's one thing a brain is good at, it's putting bits and pieces back together again, even when you have incomplete information, and coming to a good conclusion, especially if it's the brain of an expert.

And that's why I believe that Netflix was so successful, because they used data and brains where they belong in the process. They used data to first understand lots of pieces about their audience that they otherwise wouldn't have been able to understand at that depth, but then the decision to take all these bits and pieces and put them back together again and make a show like "House of Cards," that was nowhere in the data. Ted Sarandos and his team made that decision to license that show, which also meant, by the way, that they were taking a pretty big personal risk with that decision. And Amazon, on the other hand, they did it the wrong way around. They used data all the way to drive their decision-making, first when they held their competition of TV ideas, then when they selected "Alpha House" to make as a show. Which of course was a very safe decision for them, because they could always point at the data, saying, "This is what the data tells us." But it didn't lead to the exceptional results that they were hoping for.

So data is of course a massively useful tool to make better decisions, but I believe that things go wrong when data is starting to drive those decisions. No matter how powerful, data is just a tool, and to keep that in mind, I find this device here quite useful. Many of you will ...

(Laughter)

Before there was data, this was the decision-making device to use.

(Laughter)

Many of you will know this. This toy here is called the Magic 8 Ball, and it's really amazing, because if you have a decision to make, a yes or no question, all you have to do is you shake the ball, and then you get an answer -- "Most Likely" -- right here in this window in real time. I'll have it out later for tech demos.

(Laughter)

Now, the thing is, of course -- so I've made some decisions in my life where, in hindsight, I should have just listened to the ball. But, you know, of course, if you have the data available, you want to replace this with something much more sophisticated, like data analysis to come to a better decision. But that does not change the basic setup. So the ball may get smarter and smarter and smarter, but I believe it's still on us to make the decisions if we want to achieve something extraordinary, on the right end of the curve. And I find that a very encouraging message, in fact, that even in the face of huge amounts of data, it still pays off to make decisions, to be an expert in what you're doing and take risks. Because in the end, it's not data, it's risks that will land you on the right end of the curve.

Thank you.

(Applause)
