Tricia Wang: The human insights missing from big data

In ancient Greece, when anyone from slaves to soldiers, poets and politicians, needed to make a big decision on life's most important questions, like, "Should I get married?" or "Should we embark on this voyage?" or "Should our army advance into this territory?" they all consulted the oracle.

고대 그리스에서는 노예부터 군인까지, 시인과 정치인 모두 다 인생의 가장 중요한 질문들에 대해 큰 결정을 내려야 했어요. 예를 들어, "결혼을 해야 할까?" "항해를 떠나야 할까?" "이 지역까지 군대를 확장시켜야 할까?" 같은 질문들이죠. 그들은 모두 오라클과 상의했어요.

So this is how it worked: you would bring her a question and you would get on your knees, and then she would go into this trance. It would take a couple of days, and then eventually she would come out of it, giving you her predictions as your answer.

이것이 그 과정입니다. 당신은 그녀에게 질문을 가져와요. 그리고 무릎을 꿇습니다. 그러면 그녀는 가수상태에 빠집니다. 이틀 정도 걸려요. 그리고 마침내 그녀는 가수상태에서 깨어나고 당신의 질문에 대한 예언을 합니다.

From the oracle bones of ancient China to ancient Greece to Mayan calendars, people have craved prophecy in order to find out what's going to happen next. And that's because we all want to make the right decision. We don't want to miss something. The future is scary, so it's much nicer knowing that we can make a decision with some assurance of the outcome.

고대 중국의 신탁 뼈부터 고대 그리스, 마야 달력까지 사람들은 예언을 갈망해 왔어요. 후에 무슨 일이 일어날지 알아내기 위해서 였죠. 이것은 우리 모두가 옳은 선택을 하고 싶어하기 때문입니다. 우리는 어떤 것도 놓치고 싶어하지 않아요. 미래는 무서워요. 그래서 결과에 대한 확신을 가지고 결정을 내릴 수 있다면 훨씬 좋겠지요.

Well, we have a new oracle, and its name is big data, or we call it "Watson" or "deep learning" or "neural net." And these are the kinds of questions we ask of our oracle now, like, "What's the most efficient way to ship these phones from China to Sweden?" Or, "What are the odds of my child being born with a genetic disorder?" Or, "What is the sales volume we can predict for this product?"

우리에겐 새로운 오라클이 있어요. 빅 데이터라고 불리죠. "왓슨" "딥 러닝" 아니면 "뉴럴네트"라고 부르기도 해요. 그리고 우리는 이 오라클에게 이런 질문들을 합니다. "이 핸드폰들을 중국에서 스웨덴까지 배송하기 위한 가장 효율적인 방법은 무엇일까?" 아니면, "내 아이가 유전질환을 가지고 태어날 확률은 얼마나 될까?" 아니면, "이 상품이 얼마나 많은 매상을 올릴 수 있을까?"

I have a dog. Her name is Elle, and she hates the rain. And I have tried everything to untrain her. But because I have failed at this, I also have to consult an oracle, called Dark Sky, every time before we go on a walk, for very accurate weather predictions in the next 10 minutes. She's so sweet. So because of all of this, our oracle is a $122 billion industry.

저는 강아지 한마리를 키워요. 이름은 엘이구요. 비를 싫어해요. 이것을 위해 모든 노력을 해보았어요. 그러나 저는 그것에 실패했기 때문에 산책을 나갈 때마다, 다크 스카이라고 불리우는 오라클과 상의해야 합니다. 앞으로 10분 간의 정확한 날씨예보를 얻기 위해서지요. 정말 귀여워요. 이런 점들 때문에,이 오라클은 1220억 달러 산업입니다.

Now, despite the size of this industry, the returns are surprisingly low. Investing in big data is easy, but using it is hard. Over 73 percent of big data projects aren't even profitable, and I have executives coming up to me saying, "We're experiencing the same thing. We invested in some big data system, and our employees aren't making better decisions. And they're certainly not coming up with more breakthrough ideas."

이런 규모의 산업에도 불구하고 수익은 놀랍게도 적어요. 빅 데이터에 투자하는 것은 쉬워요. 그러나 이용하는 것은 어렵습니다. 73% 이상의 빅 데이터 프로젝트들이 이윤 조차도 못 남기고 있어요. 그리고 경영진들은 제게 이렇게 이야기 하죠. "우리도 같은 경험을 하고 있어요. 빅 데이터 시스템에 투자했는데 우리 직원들은 더 나은 결정들을 내리지 못하고 있어요. 그리고 획기적인 아이디어도 확실히 못 떠올리고 있고요."

So this is all really interesting to me, because I'm a technology ethnographer. I study and I advise companies on the patterns of how people use technology, and one of my interest areas is data. So why is having more data not helping us make better decisions, especially for companies who have all these resources to invest in these big data systems? Why isn't it getting any easier for them?

제게는 이 모든 것들이 정말 흥미로웠어요. 왜냐하면 저는 기술 민족지학자이기 때문입니다. 저는 사람들이 기술을 어떻게 이용하는지에 대한 패턴을 연구하고 기업들에게 조언합니다. 그리고 저의 관심분야 중 하나가 데이터입니다. 그러면 왜 더 많은 데이터가 더 나은 결정에 도움이 되지 않을까요? 특히, 빅 데이터 시스템에 투자할 수 있는 모든 자료가 있는 기업들 까지도요. 왜 그들의 결정이 쉬워지지 않는 것일까요?

So, I've witnessed the struggle firsthand. In 2009, I started a research position with Nokia. And at the time, Nokia was one of the largest cell phone companies in the world, dominating emerging markets like China, Mexico and India -- all places where I had done a lot of research on how low-income people use technology. And I spent a lot of extra time in China getting to know the informal economy. So I did things like working as a street vendor selling dumplings to construction workers. Or I did fieldwork, spending nights and days in internet cafés, hanging out with Chinese youth, so I could understand how they were using games and mobile phones and using it between moving from the rural areas to the cities.

저는 그 어려움을 직접 목격했습니다. 2009년, 저는 노키아에서 연구원으로 일하기 시작했어요. 그리고 그 때 노키아는 세계에서 가장 큰 핸드폰 회사 중 하나였어요. 중국, 멕시코, 인도 같이 신흥 시장을 장악하고 있었고 저는 그 곳들에서 저임금 사람들이 어떻게 기술을 이용하는지에 대한 많은 연구를 해왔어요. 그리고 저는 많은 시간을 비공식적인 경제에 대해 알기 위해 중국에서 보냈어요. 그래서 저는 길거리 노점상인이 되어보기도 하고 건설 노동자들에게 만두를 팔아보기도 했어요. 아니면 현장조사를 했지요. 피씨방에서 하루종일 중국 청년들과 지내며 그들이 어떻게 게임과 핸드폰을 사용하는지 시골과 도시간이 어떻게 사용하는지도 이해할 수 있었습니다.

Through all of this qualitative evidence that I was gathering, I was starting to see so clearly that a big change was about to happen among low-income Chinese people. Even though they were surrounded by advertisements for luxury products like fancy toilets -- who wouldn't want one? -- and apartments and cars, through my conversations with them, I found out that the ads that actually enticed them the most were the ones for iPhones, promising them this entry into this high-tech life. And even when I was living with them in urban slums like this one, I saw people investing over half of their monthly income into buying a phone, and increasingly, they were "shanzhai," which are affordable knock-offs of iPhones and other brands. They're very usable. Does the job.

이렇게 제가 수집한 질적인 증거들을 통해서 중국의 저임금 사람들 사이에서 큰 변화가 생겨날 것임이 아주 명확하게 보이기 시작했어요. 비록 그들은 고급 변기(누가 마다하겠어요?)나 아파트, 자동차 같은 고급 상품 광고에 둘러싸여 있었지만, 그들과 나눈 대화를 통해서 저는 이러한 광고들 중에서 실제로 그들을 가장 많이 유혹한 것은 첨단 기술의 삶을 약속하는 아이폰 광고라는 것을 알게 되었어요. 그리고 이런 도시 빈민가에서 그들과 함께 살고있을 때 조차도 그들의 월급의 반 이상을 핸드폰 구매에 투자한다는 것도 목격했습니다. 그리고 이것들은 "산자이" 인데요. 아이폰과 다른 브랜드들의 값 싼 모조품입니다. 사용가능 합니다. 핸드폰의 기능을 합니다.

And after years of living with migrants and working with them and just really doing everything that they were doing, I started piecing all these data points together -- from the things that seem random, like me selling dumplings, to the things that were more obvious, like tracking how much they were spending on their cell phone bills. And I was able to create this much more holistic picture of what was happening. And that's when I started to realize that even the poorest in China would want a smartphone, and that they would do almost anything to get their hands on one.

그리고 몇년 동안 이민자들과 함께 살고, 일하고 그들이 하는 모든 것을 함께 하면서 저는 조각 조각의 데이터를 하나로 모으기 시작했어요. 제가 만두를 팔게 되는 랜덤적인 일부터 그들이 핸드폰 요금으로 얼마나 지출하는지와 같은 명확한 일까지요. 저는 일어나고 있는 일의 전체적인 그림을 만들어 낼 수 있었습니다. 그리고 이때 저는 깨달았죠. 중국의 제일 가난한 사람조차도 스마트 폰을 원한다는 것을요. 그들은 스마트 폰을 손에 넣기 위해 무슨 일이든 할 것임을요.

You have to keep in mind, iPhones had just come out, it was 2009, so this was, like, eight years ago, and Androids had just started looking like iPhones. And a lot of very smart and realistic people said, "Those smartphones -- that's just a fad. Who wants to carry around these heavy things where batteries drain quickly and they break every time you drop them?" But I had a lot of data, and I was very confident about my insights, so I was very excited to share them with Nokia.

여러분들이 명심하셔야 할 것이 아이폰이 막 출시가 되었을 때가 2009년이었습니다. 그러니까 한 8년 전의 일이네요. 안드로이드가 막 아이폰과 같은 형태로 출시되기 시작했습니다. 많은 영리하고 현실적인 사람들이 말했어요. "이런 스마트 폰들은 그냥 일시적인 유행일 뿐입니다. 이렇게 무거운 것을 누가 들고 다니고 싶어할까요? 배터리도 금방 닳고, 떨어뜨리면 부서지는데요." 그러나 저에게는 많은 데이터가 있었습니다. 그리고 저는 통찰력에 굉장히 자신감이 있었습니다. 그래서 이 데이터를 노키아와 공유하는 것이 정말 흥분되는 일이었죠.

But Nokia was not convinced, because it wasn't big data. They said, "We have millions of data points, and we don't see any indicators of anyone wanting to buy a smartphone, and your data set of 100, as diverse as it is, is too weak for us to even take seriously." And I said, "Nokia, you're right. Of course you wouldn't see this, because you're sending out surveys assuming that people don't know what a smartphone is, so of course you're not going to get any data back about people wanting to buy a smartphone in two years. Your surveys, your methods have been designed to optimize an existing business model, and I'm looking at these emergent human dynamics that haven't happened yet. We're looking outside of market dynamics so that we can get ahead of it." Well, you know what happened to Nokia? Their business fell off a cliff. This -- this is the cost of missing something. It was unfathomable.

그러나 노키아는 확신하지 않았어요. 왜냐하면 이것은 빅 데이터가 아니였기 때문이지요. 그들이 말하길, "우리에게는 수백만개의 데이터가 있는데 사람들이 스마트 폰을 원한다는 지표는 어디서든 찾아 볼 수가 없습니다. 그리고 당신의 100개의 데이터 셋이 다양하긴 하지만, 우리가 받아들이기엔 너무 약합니다." 저는 말했죠. "노키아, 당신 말이 맞아요. 물론 당신은 이것을 볼 수 없겠죠. 왜냐하면 당신은 사람들이 스마트 폰에 대해 모른다는 가정하에 설문조사를 하고 있으니까요. 그러니까 당연히 당신들은 사람들이 2년 내로 스마트 폰을 사고 싶어한다는 데이터를 얻을 수가 없는 거예요. 당신의 설문조사와 방법은 현재의 비지니스 모델을 확장시키기 위해 디자인되어 있어요. 그리고 저는 아직 일어나지 않은 새로운 인간 역학을 보고 있고요. 저희는 시장을 앞서나가기 위해 시장 밖을 내다보고 있습니다." 여러분은 노키아가 어떻게 되었는지 잘 아시죠? 그들의 사업은 절벽으로 떨어졌습니다. 이것이 무언가를 놓친 것의 대가입니다. 불가해한 일입니다.

But Nokia's not alone. I see organizations throwing out data all the time because it didn't come from a quant model or it doesn't fit in one. But it's not big data's fault. It's the way we use big data; it's our responsibility. Big data's reputation for success comes from quantifying very specific environments, like electricity power grids or delivery logistics or genetic code, when we're quantifying in systems that are more or less contained.

그러나 노키아 뿐만이 아닙니다. 기관들은 항상 데이터를 버려요. 왜냐하면 그 데이터들은 정량적인 모델로 수집되지 않았기 때문이지요. 아니면 그 모델에 들어맞지 않든가요. 그러나 이것은 빅 데이터의 잘못이 아니예요. 빅 데이터 사용 방법이 문제입니다. 우리의 책임이지요. 빅 데이터의 성공은 아주 세분화된 영역을 수량화 시키는 데에서 옵니다. 예를 들어 전력망이나, 배달물류, 유전암호 같이 변화가 별로없는 체계를 수량화 시킬 때이지요.

But not all systems are as neatly contained. When you're quantifying and systems are more dynamic, especially systems that involve human beings, forces are complex and unpredictable, and these are things that we don't know how to model so well. Once you predict something about human behavior, new factors emerge, because conditions are constantly changing. That's why it's a never-ending cycle. You think you know something, and then something unknown enters the picture. And that's why just relying on big data alone increases the chance that we'll miss something, while giving us this illusion that we already know everything.

그러나 모든 체계가 안정적이지는 않아요. 더욱 유동적인 체계를 수량화시킬 때 특히 인간과 관련한 체계를 수량화시킬 때 값은 복잡하고, 예측하기 어렵죠. 그리고 이것들이 우리가 어떻게 모델화 시켜야 할지 잘 모르는 부분입니다. 인간 행동에 대해 예측을 하면 다른 요소들이 생겨나요. 왜냐하면 상황이 계속 변화하기 때문이죠. 그래서 이것은 끝나지 않는 순환이에요. 당신은 무언가를 알고 있다고 생각합니다. 그리고, 알려지지 않은 무언가가 등장합니다. 그렇기 때문에 빅 데이터에만 의존하는 것은 우리가 모든 것을 다 알고 있다는 망상에 빠져있을 동안 무언가를 놓칠 확률을 높여줍니다.

And what makes it really hard to see this paradox and even wrap our brains around it is that we have this thing that I call the quantification bias, which is the unconscious belief of valuing the measurable over the immeasurable. And we often experience this at our work. Maybe we work alongside colleagues who are like this, or even our whole entire company may be like this, where people become so fixated on that number, that they can't see anything outside of it, even when you present them evidence right in front of their face. And this is a very appealing message, because there's nothing wrong with quantifying; it's actually very satisfying. I get a great sense of comfort from looking at an Excel spreadsheet, even very simple ones.

이런 역설을 보기 힘들게 하고 이해하기 조차도 어렵게 하는 이유는 우리는 수량화의 편견이라고 불리우는 수량화 할 수 있는 것이 수량화 할 수 없는 것보다 더 귀중하다는 무의식적인 믿음을 가지고 있기 때문이에요. 그리고 이것을 우리는 종종 일터에서 경험합니다. 우리는 이런 믿음을 가진 동료와 일하고 있을 수도 있어요. 아니면 회사 전체가 이런 믿음을 가지고 있어서 사람들이 숫자에 집착하게 되고 심지어 증거물들을 코 앞에 제시해 주어도 숫자 이외의 것은 볼 수 없게 되죠. 이것은 정말 흥미로운 메세지입니다. 왜냐하면 수량화하는 것이 옳지 않은 것은 아니예요. 사실 굉장히 뿌듯하죠. 저도 엑셀 스프레드 시트를 보면, 심지어 아주 간단한 것들만 보아도 굉장히 뿌듯해요.

(Laughter)

(웃음)

It's just kind of like, "Yes! The formula worked. It's all OK. Everything is under control."

이런 종류의 뿌듯함이죠. "와! 공식이 잘 세워졌다. 다 괜찮아. 모든 것이 잘 관리되고 있어."

But the problem is that quantifying is addictive. And when we forget that and when we don't have something to kind of keep that in check, it's very easy to just throw out data because it can't be expressed as a numerical value. It's very easy just to slip into silver-bullet thinking, as if some simple solution existed. Because this is a great moment of danger for any organization, because oftentimes, the future we need to predict -- it isn't in that haystack, but it's that tornado that's bearing down on us outside of the barn. There is no greater risk than being blind to the unknown. It can cause you to make the wrong decisions. It can cause you to miss something big.

그러나 문제는 수량화하는 것은 중독성이 있어요. 그리고 우리가 그것을 잊어버리고 계속 체크할 거리가 없어지면 그 데이터들을 그냥 버려지기 쉬워요. 왜냐하면 숫자로 표현될 수가 없으니까요. 마치 아주 간단한 해결책이 존재했던것 처럼 이것이 묘책이라고 생각하기 쉬워요. 왜냐하면 이것이 기관에게는 아주 위험한 순간이기 때문이고 종종 우리가 예언해야 할 미래는 건초더미에 있지 않기 때문이죠. 이것은 우리를 향해 외양간 밖에서 돌진하는 토네이도입니다. 알려지지 않은 것들이 있음을 깨닫지 못하는 것보다 더 큰 위험은 없어요. 잘못된 결정을 하게 만들죠. 아주 큰 무언가를 놓치게 만들죠.

But we don't have to go down this path. It turns out that the oracle of ancient Greece holds the secret key that shows us the path forward. Now, recent geological research has shown that the Temple of Apollo, where the most famous oracle sat, was actually built over two earthquake faults. And these faults would release these petrochemical fumes from underneath the Earth's crust, and the oracle literally sat right above these faults, inhaling enormous amounts of ethylene gas from these fissures.

그러나 우리가 똑같이 하지 않아도 돼요. 고대 그리스의 오라클에게 미래를 보여주는 비밀 열쇠가 있습니다. 최근 지질 연구에 의하면 가장 유명한 오라클이 있던 아폴로 신전은 사실 두 개의 지진 단층 위에 세워졌다고 합니다. 그리고 이런 단층은 지각 밑에서 부터 석유화학 가스를 방출합니다. 그리고 오라클은 말그대로 이 단층 바로 위에 모셔져 있었어요. 아주 많은 양의 에틸렌 가스를 흡입하면서요.

(Laughter)

(웃음)

It's true.

사실이예요.

(Laughter) It's all true, and that's what made her babble and hallucinate and go into this trance-like state. She was high as a kite!

(웃음) 다 사실이에요. 그래서 그녀는 횡설수설하며, 환각을 느낄 수 있었고 가수상태와 비슷한 상태에 빠질 수 있었습니다. 그녀는 완전히 취해있었어요!

(Laughter)

(웃음)

So how did anyone -- How did anyone get any useful advice out of her in this state? Well, you see those people surrounding the oracle? You see those people holding her up, because she's, like, a little woozy? And you see that guy on your left-hand side holding the orange notebook? Well, those were the temple guides, and they worked hand in hand with the oracle. When inquisitors would come and get on their knees, that's when the temple guides would get to work, because after they asked her questions, they would observe their emotional state, and then they would ask them follow-up questions, like, "Why do you want to know this prophecy? Who are you? What are you going to do with this information?" And then the temple guides would take this more ethnographic, this more qualitative information, and interpret the oracle's babblings. So the oracle didn't stand alone, and neither should our big data systems.

그러면 어떻게, 어떻게 이런 상태에서 유용한 조언을 받을 수가 있었을까요? 오라클 주변을 둘러싼 이 사람들 보이시나요? 이 사람들이 그녀를 붙잡고 있어요. 그녀의 머리가 조금 띵해서 일까요? 그리고 왼쪽의 남자가 보이죠. 오렌지 색의 공책을 들고 있어요. 이들은 신전 가이드입니다. 그리고 이들은 오라클과 손 잡고 일했어요. 질문자들이 와서 무릎을 꿇으면 그때가 신전 가이드가 필요한 때 입니다. 왜냐하면 그녀에게 질문을 한 후 그들은 이 질문자의 감정상태를 관찰합니다. 그리고 이런 후속 질문을 합니다. "왜 이 예언을 알고싶어 하십니까? 당신은 누구십니까? 이 정보로 무엇을 할 것입니까?" 그리고 신전 가이드는 이런 민족지적인 더 질적인 정보를 담게 되고 오라클의 횡설수설한 대답을 해석해 줍니다. 그래서 오라클은 혼자 하지 않았어요. 그리고 우리의 빅 데이터 시스템도 홀로 있으면 안돼요.

Now to be clear, I'm not saying that big data systems are huffing ethylene gas, or that they're even giving invalid predictions. The total opposite. But what I am saying is that in the same way that the oracle needed her temple guides, our big data systems need them, too. They need people like ethnographers and user researchers who can gather what I call thick data. This is precious data from humans, like stories, emotions and interactions that cannot be quantified. It's the kind of data that I collected for Nokia that comes in in the form of a very small sample size, but delivers incredible depth of meaning.

분명하게 이야기 하자면 저는 빅 데이터 시스템이 에틸렌 가스를 흡입하고 있다고 말하는 것이 아닙니다. 무효한 예견을 한다고 말하는 것도 아니고요. 아주 반대예요. 제가 이야기 하고자 하는 것은 오라클이 신전 가이드가 필요했던 것과 같이 우리의 빅 데이터 시스템도 그런 가이드가 필요하다는 것입니다. 빅 데이터 시스템은 민족지학자나 이용자 연구원과 같이 심층적 데이터를 수집할 수 있는 사람이 필요합니다. 이것은 사람에게서 나오는 소중한 데이터 인데요. 이야기, 감정 그리고 의사소통과 같이 수량화 될 수 없는 것들이죠. 그것이 제가 노키아를 위해 수집했던 데이터입니다. 아주 작은 표본에서 수집 되었지만 엄청난 깊이의 의미를 전달하죠.

And what makes it so thick and meaty is the experience of understanding the human narrative. And that's what helps to see what's missing in our models. Thick data grounds our business questions in human questions, and that's why integrating big and thick data forms a more complete picture. Big data is able to offer insights at scale and leverage the best of machine intelligence, whereas thick data can help us rescue the context loss that comes from making big data usable, and leverage the best of human intelligence. And when you actually integrate the two, that's when things get really fun, because then you're no longer just working with data you've already collected. You get to also work with data that hasn't been collected. You get to ask questions about why: Why is this happening?

그리고 이 데이터를 아주 심층적이고 알차게 만드는 것은 인간 이야기를 이해하는 경험입니다. 그것이 우리 모델에서 빠진 부분을 볼 수 있게 도와줍니다. 심층적 데이터는 사업적인 질문에게 인간적인 질문을 부여합니다. 그래서 빅 데이터와 심층적 데이터의 통합이 완벽한 그림을 그려냅니다. 빅 데이터는 대규모로 통찰을 제공할 수 있으며 기계 지능을 최대로 이용할 수 있습니다. 반면 심층적 데이터는 빅 데이터를 사용할 수 있게 함으로 생기는 문맥적 손실을 줄이는데 도움을 줘서 인간 지능을 최대로 이용할 수 있게 해줍니다. 실제로 이 두 데이터를 통합할 때 굉장히 흥미로워집니다. 왜냐하면 그때는 수집한 데이터만 가지고 일을 하는 것이 아니고 수집하지 않은 데이터까지 함께 가지고 일을 하게 되는 것이기 때문입니다. '왜' 라는 질문을 하게 되지요. 왜 이것이 일어나는 것일까?

Now, when Netflix did this, they unlocked a whole new way to transform their business. Netflix is known for their really great recommendation algorithm, and they had this $1 million prize for anyone who could improve it. And there were winners. But Netflix discovered the improvements were only incremental. So to really find out what was going on, they hired an ethnographer, Grant McCracken, to gather thick data insights. And what he discovered was something that they hadn't seen initially in the quantitative data. He discovered that people loved to binge-watch. In fact, people didn't even feel guilty about it. They enjoyed it.

넷플릭스가 이것을 했을 때 그들은 그들의 사업을 변화시킬 새로운 길을 열었어요. 넷플릭스는 아주 훌륭한 추천 알고리즘이 있는 것으로 알려져 있는데요. 그들은 이 알고리즘을 향상시키는데에 백만 달러의 상금을 걸었습니다. 그리고 우승자가 있었습니다. 그러나 넷플릭스는 이런 향상이 점진적인 개선에 불과함을 발견했고 정말 무엇이 일어나고 있는지 알아내기 위해 민족지학자인 그랜트 맥크래켄을 고용하여 심층적 데이터를 수집했어요. 그리고 그는 그들이 수량화된 데이터에서 초기에 보지 못했던 점을 발견하게 되었습니다. 그는 사람들이 몰아보기를 좋아한다는 것을 발견했어요. 사실, 사람들은 몰아보는 것에 죄책감을 느끼지도 않아요. 그것을 즐깁니다.

(Laughter)

(웃음)

So Netflix was like, "Oh. This is a new insight." So they went to their data science team, and they were able to scale this big data insight in with their quantitative data. And once they verified it and validated it, Netflix decided to do something very simple but impactful. They said, instead of offering the same show from different genres or more of the different shows from similar users, we'll just offer more of the same show. We'll make it easier for you to binge-watch. And they didn't stop there. They did all these things to redesign their entire viewer experience, to really encourage binge-watching. It's why people and friends disappear for whole weekends at a time, catching up on shows like "Master of None." By integrating big data and thick data, they not only improved their business, but they transformed how we consume media. And now their stocks are projected to double in the next few years.

그래서 넷플릭스는 "오, 이것은 새로운 시각이다." 그래서 그들은 그들의 데이터 과학 팀에게 가서 그들의 수량화된 데이터와 함께 빅 데이터를 측정 할 수 있었어요. 확인하고, 입증한 후에 넷플릭스는 아주 단순하면서도 영향력 있는 것을 하기로 결정했어요. 그들이 말하길, 다른 장르의 같은 쇼나 비슷한 이용자에서의 다른 쇼들을 제공하는 것 대신에 우리는 그냥 같은 쇼를 계속 제공할 것입니다. 우리는 당신이 더 쉽게 몰아보게 할 것입니다. 그리고 그들은 거기서 멈추지 않았어요. 그들은 모든 시청자들의 경험을 다시 디자인하기 위해, 몰아보기를 정말로 장려하기 위해 이 모든 것들을 하였습니다. 이것이 사람들과 친구들이 주말마다 가끔씩 사라지는 이유입니다. "Master of None"과 같은 쇼를 몰아보기 때문이죠. 빅 데이터와 심층적 데이터를 통합함으로써, 사업을 개선시켰을 뿐만아니라 우리가 어떻게 미디어를 받아들이는지를 변화시켰어요. 지금 그 주식은 앞으로 몇년 안에 두 배가 뛸 것으로 예상하고 있어요.

But this isn't just about watching more videos or selling more smartphones. For some, integrating thick data insights into the algorithm could mean life or death, especially for the marginalized. All around the country, police departments are using big data for predictive policing, to set bond amounts and sentencing recommendations in ways that reinforce existing biases. NSA's Skynet machine learning algorithm has possibly aided in the deaths of thousands of civilians in Pakistan from misreading cellular device metadata. As all of our lives become more automated, from automobiles to health insurance or to employment, it is likely that all of us will be impacted by the quantification bias.

그러나 이것은 더 많은 비디오를 시청하고 더 많은 스마트 폰을 파는 것에 국한된 것이 아닙니다. 심층적 데이터를 알고리즘에 통합하는 것이 몇몇의 사람들에게는 삶이나 죽음을 의미할 수도 있어요. 특히 소외된 사람들에게 말이죠. 전국의 경찰서에서는 범죄예측을 위해 빅 데이터를 사용합니다. 보석금과 판결 권고를 현존하는 편향을 보강하는 방향으로 설정합니다. NSA의 스카이넷 머신 러닝 알고리즘은 휴대폰 메타데이터를 잘못 읽어서 수천 명의 파키스탄 민간인의 죽음에 일조했을 가능성이 있습니다. 우리 모든 삶이 자동차부터 의료보험, 직장까지 더욱 자동화가 되어가면서 우리 모두 수량화의 편견으로부터 영향을 받게 될 것입니다.

Now, the good news is that we've come a long way from huffing ethylene gas to make predictions. We have better tools, so let's just use them better. Let's integrate the big data with the thick data. Let's bring our temple guides with the oracles, and whether this work happens in companies or nonprofits or government or even in the software, all of it matters, because that means we're collectively committed to making better data, better algorithms, better outputs and better decisions. This is how we'll avoid missing that something.

좋은 소식은 우리는 에틸렌 가스흡입으로 예언을 하는 것에서 부터 먼 길을 왔습니다. 우리에겐 더 좋은 도구들이 있어요. 그러니까 더 좋게 이용합시다. 빅 데이터를 심층적 데이터와 통합합시다. 오라클과 함께 있던 신전 가이드를 가져옵시다. 그리고 이런 일이 일어나는 곳이 기업이든, 비영리 기관이든, 정부이든, 심지어 소프트웨어이든지 모두 중요합니다. 왜냐하면 이것은 우리가 다함께 더 나은 데이터 더 나은 알고리즘, 더 나은 결과와 더 나은 결정을 내리는 것에 전념하고 있음을 의미하기 때문이죠. 이렇게 하면 중요한 것을 놓치지 않게 될 겁니다.

(Applause)

(박수)

Related talks

Mona Chalabi: 3 ways to spot a bad statistic

Sebastian Wernicke: How to use data to make a hit TV show

Mallory Freeman: Your company's data could help end world hunger

Giorgia Lupi: How we can find ourselves in data

Madhumita Murgia: How data brokers sell your identity

Jer Thorp: Make data more human
