Jennifer Golbeck: Your social media "likes" expose more than you think

If you remember that first decade of the web, it was really a static place. You could go online, you could look at pages, and they were put up either by organizations who had teams to do it or by individuals who were really tech-savvy for the time. And with the rise of social media and social networks in the early 2000s, the web was completely changed to a place where now the vast majority of content we interact with is put up by average users, either in YouTube videos or blog posts or product reviews or social media postings. And it's also become a much more interactive place, where people are interacting with others, they're commenting, they're sharing, they're not just reading.

So Facebook is not the only place you can do this, but it's the biggest, and it serves to illustrate the numbers. Facebook has 1.2 billion users per month. So half the Earth's Internet population is using Facebook. They are a site, along with others, that has allowed people to create an online persona with very little technical skill, and people responded by putting huge amounts of personal data online. So the result is that we have behavioral, preference, demographic data for hundreds of millions of people, which is unprecedented in history. And as a computer scientist, what this means is that I've been able to build models that can predict all sorts of hidden attributes for all of you that you don't even know you're sharing information about. As scientists, we use that to help the way people interact online, but there's less altruistic applications, and there's a problem in that users don't really understand these techniques and how they work, and even if they did, they don't have a lot of control over it. So what I want to talk to you about today is some of these things that we're able to do, and then give us some ideas of how we might go forward to move some control back into the hands of users.

So this is Target, the company. I didn't just put that logo on this poor, pregnant woman's belly. You may have seen this anecdote that was printed in Forbes magazine where Target sent a flyer to this 15-year-old girl with advertisements and coupons for baby bottles and diapers and cribs two weeks before she told her parents that she was pregnant. Yeah, the dad was really upset. He said, "How did Target figure out that this high school girl was pregnant before she told her parents?" It turns out that they have the purchase history for hundreds of thousands of customers and they compute what they call a pregnancy score, which is not just whether or not a woman's pregnant, but what her due date is. And they compute that not by looking at the obvious things, like, she's buying a crib or baby clothes, but things like, she bought more vitamins than she normally had, or she bought a handbag that's big enough to hold diapers. And by themselves, those purchases don't seem like they might reveal a lot, but it's a pattern of behavior that, when you take it in the context of thousands of other people, starts to actually reveal some insights. So that's the kind of thing that we do when we're predicting stuff about you on social media. We're looking for little patterns of behavior that, when you detect them among millions of people, lets us find out all kinds of things.

So in my lab and with colleagues, we've developed mechanisms where we can quite accurately predict things like your political preference, your personality score, gender, sexual orientation, religion, age, intelligence, along with things like how much you trust the people you know and how strong those relationships are. We can do all of this really well. And again, it doesn't come from what you might think of as obvious information.

So my favorite example is from this study that was published this year in the Proceedings of the National Academies. If you Google this, you'll find it. It's four pages, easy to read. And they looked at just people's Facebook likes, so just the things you like on Facebook, and used that to predict all these attributes, along with some other ones. And in their paper they listed the five likes that were most indicative of high intelligence. And among those was liking a page for curly fries. (Laughter) Curly fries are delicious, but liking them does not necessarily mean that you're smarter than the average person. So how is it that one of the strongest indicators of your intelligence is liking this page when the content is totally irrelevant to the attribute that's being predicted? And it turns out that we have to look at a whole bunch of underlying theories to see why we're able to do this. One of them is a sociological theory called homophily, which basically says people are friends with people like them. So if you're smart, you tend to be friends with smart people, and if you're young, you tend to be friends with young people, and this is well established for hundreds of years. We also know a lot about how information spreads through networks. It turns out things like viral videos or Facebook likes or other information spreads in exactly the same way that diseases spread through social networks. So this is something we've studied for a long time. We have good models of it. And so you can put those things together and start seeing why things like this happen. So if I were to give you a hypothesis, it would be that a smart guy started this page, or maybe one of the first people who liked it would have scored high on that test. 
And they liked it, and their friends saw it, and by homophily, we know that he probably had smart friends, and so it spread to them, and some of them liked it, and they had smart friends, and so it spread to them, and so it propagated through the network to a host of smart people, so that by the end, the action of liking the curly fries page is indicative of high intelligence, not because of the content, but because the actual action of liking reflects back the common attributes of other people who have done it.
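The hypothesis above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the method used in the study: the binary hidden trait, the network size, the `homophily` and `like_prob` values, and the function name `simulate_like_spread` are all hypothetical. It builds a friendship network in which people mostly befriend others who share their trait, lets one person with the trait like a page, and spreads the like from friend to friend for a few rounds.

```python
import random


def simulate_like_spread(n_people=2000, picks_per_person=10, homophily=0.9,
                         like_prob=0.3, rounds=3, seed=0):
    """Spread one page like through a homophilous friendship network.

    Returns (share of likers with the trait, share of everyone with the
    trait, number of likers). All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)

    # Hypothetical hidden trait: exactly half of the population has it.
    has_trait = [person % 2 == 0 for person in range(n_people)]
    pools = {
        True: [p for p in range(n_people) if has_trait[p]],
        False: [p for p in range(n_people) if not has_trait[p]],
    }

    # Homophily: each person mostly picks friends who share their trait.
    friends = [set() for _ in range(n_people)]
    for person in range(n_people):
        for _ in range(picks_per_person):
            if rng.random() < homophily:
                pool = pools[has_trait[person]]
            else:
                pool = pools[not has_trait[person]]
            other = rng.choice(pool)
            if other != person:
                friends[person].add(other)
                friends[other].add(person)

    # One person who has the trait likes the page first.
    first_liker = pools[True][0]
    likers = {first_liker}
    frontier = [first_liker]

    # Contagion: each round, friends of the newest likers may like it too.
    for _ in range(rounds):
        next_frontier = []
        for person in frontier:
            for friend in sorted(friends[person]):
                if friend not in likers and rng.random() < like_prob:
                    likers.add(friend)
                    next_frontier.append(friend)
        frontier = next_frontier

    share_among_likers = sum(has_trait[p] for p in likers) / len(likers)
    share_overall = sum(has_trait) / n_people
    return share_among_likers, share_overall, len(likers)


share_likers, share_all, n_likers = simulate_like_spread()
print(f"{n_likers} likers; trait share {share_likers:.2f} vs {share_all:.2f} overall")
```

In this toy model the page has nothing to do with the trait, yet early likers are more likely to have it than the population as a whole, simply because the like travels along friendships between similar people. The number of rounds is kept small on purpose: if the like reached nearly everyone, the difference would fade.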

So this is pretty complicated stuff, right? It's a hard thing to sit down and explain to an average user, and even if you do, what can the average user do about it? How do you know that you've liked something that indicates a trait for you that's totally irrelevant to the content of what you've liked? There's a lot of power that users don't have to control how this data is used. And I see that as a real problem going forward.

So I think there's a couple paths that we want to look at if we want to give users some control over how this data is used, because it's not always going to be used for their benefit. An example I often give is that, if I ever get bored being a professor, I'm going to go start a company that predicts all of these attributes and things like how well you work in teams and if you're a drug user, if you're an alcoholic. We know how to predict all that. And I'm going to sell reports to H.R. companies and big businesses that want to hire you. We totally can do that now. I could start that business tomorrow, and you would have absolutely no control over me using your data like that. That seems to me to be a problem.

So one of the paths we can go down is the policy and law path. And in some respects, I think that that would be most effective, but the problem is we'd actually have to do it. Observing our political process in action makes me think it's highly unlikely that we're going to get a bunch of representatives to sit down, learn about this, and then enact sweeping changes to intellectual property law in the U.S. so users control their data.

We could go the policy route, where social media companies say, you know what? You own your data. You have total control over how it's used. The problem is that the revenue models for most social media companies rely on sharing or exploiting users' data in some way. It's sometimes said of Facebook that the users aren't the customer, they're the product. And so how do you get a company to cede control of their main asset back to the users? It's possible, but I don't think it's something that we're going to see change quickly.

So I think the other path that we can go down that's going to be more effective is one of more science. It's doing science that allowed us to develop all these mechanisms for computing this personal data in the first place. And it's actually very similar research that we'd have to do if we want to develop mechanisms that can say to a user, "Here's the risk of that action you just took." By liking that Facebook page, or by sharing this piece of personal information, you've now improved my ability to predict whether or not you're using drugs or whether or not you get along well in the workplace. And that, I think, can affect whether or not people want to share something, keep it private, or just keep it offline altogether. We can also look at things like allowing people to encrypt data that they upload, so it's kind of invisible and worthless to sites like Facebook or third party services that access it, but that select users who the person who posted it want to see it have access to see it. This is all super exciting research from an intellectual perspective, and so scientists are going to be willing to do it. So that gives us an advantage over the law side.
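A mechanism that tells a user "here's the risk of that action you just took" could, in its simplest form, report how much a single like shifts a predicted probability. The sketch below uses Bayes' rule for one like and one trait. It is an illustration only: the function names and every number in it are hypothetical, and a real system would combine many signals at once.

```python
def probability_after_like(prior, like_rate_with_trait, like_rate_without_trait):
    """Bayes' rule: probability of the trait given that the user liked the page."""
    with_trait = prior * like_rate_with_trait
    without_trait = (1.0 - prior) * like_rate_without_trait
    return with_trait / (with_trait + without_trait)


def risk_report(prior, like_rate_with_trait, like_rate_without_trait):
    """Summarize how one like changes what a model would predict."""
    after = probability_after_like(
        prior, like_rate_with_trait, like_rate_without_trait
    )
    return {"before": prior, "after": after, "change": after - prior}


# Hypothetical numbers: 10% of users have the trait; 6% of users with the
# trait like the page, compared with 2% of users without it.
report = risk_report(prior=0.10, like_rate_with_trait=0.06,
                     like_rate_without_trait=0.02)
print(report)
```

With these made-up rates, the like moves the predicted probability from 10% to 25%, which is the kind of change a warning to the user could surface before the like is shared.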

One of the problems that people bring up when I talk about this is, they say, you know, if people start keeping all this data private, all those methods that you've been developing to predict their traits are going to fail. And I say, absolutely, and for me, that's success, because as a scientist, my goal is not to infer information about users, it's to improve the way people interact online. And sometimes that involves inferring things about them, but if users don't want me to use that data, I think they should have the right to do that. I want users to be informed and consenting users of the tools that we develop.

And so I think encouraging this kind of science and supporting researchers who want to cede some of that control back to users and away from the social media companies means that going forward, as these tools evolve and advance, means that we're going to have an educated and empowered user base, and I think all of us can agree that that's a pretty ideal way to go forward.

Thank you.

(Applause)
