Rébecca Kleinberger: Why you don't like the sound of your own voice

If you ask evolutionary biologists when did humans become humans, some of them will say that, well, at some point we started standing on our feet, became biped and became the masters of our environment. Others will say that because our brain started growing much bigger, that we were able to have much more complex cognitive processes. And others might argue that it's because we developed language that allowed us to evolve as a species. Interestingly, those three phenomena are all connected. We are not sure how or in which order, but they are all linked with the change of shape of a little bone in the back of your neck that changed the angle between our head and our body. That means we were able to stand upright but also for our brain to evolve in the back and for our voice box to grow from seven centimeters for primates to 11 and up to 17 centimeters for humans.

Om man frågar evolutionsbiologer när människan blev människa svarar en del av dem när vi började stå på våra fötter blev tvåbenta och blev härskare över vår omgivning. Andra säger att det var för att vår hjärna började växa sig större som vi fick förmågan till mycket mer komplexa kognitiva processer. Och ytterligare andra hävdar att det är för att vi utvecklade språk som vi kunde utvecklas som art. Intressant nog så är alla tre fenomenen sammankopplade. Vi vet inte hur eller i vilken ordning men de är alla länkade till förändringen av ett litet ben i nacken som förändrade vinkeln mellan huvudet och kroppen. Detta gjorde att vi kunde stå upprätt men också att hjärnan kunde utvecklas i bakhuvudet och att vårt struphuvud kunde växa från sju centimeter hos primater till 11-17 centimeter hos människor.

And this is called the descent of the larynx. And the larynx is the site of your voice. When baby humans are born today, their larynx is not descended yet. That only happens at about three months old. So, metaphorically, each of us here has relived the evolution of our whole species. And talking about babies, when you were starting to develop in your mother's womb, the first sensation that you had coming from the outside world, at only three weeks old, when you were about the size of a shrimp, was the tactile sensation coming from the vibrations of your mother's voice.

Detta kallas struphuvudets nedsänkning. Struphuvudet är där rösten finns. När bebisar föds, har deras struphuvud inte sjunkit ner ännu. Det händer vid tre månaders ålder. Så bildligt talat har var och en av oss återupplevt vår arts hela utveckling. Och på tal om bebisar. När du började utvecklas i din mammas mage så var din första förnimmelse av världen utanför vid tre veckors ålder, när du var stor som en räka känsloförnimmelsen av vibrationerna från din mammas röst.

So, as we can see, the human voice is quite meaningful and important at the level of the species, at the level of the society -- this is how we communicate and create bonds, and at the personal and interpersonal levels -- with our voice, we share much more than words and data, we share basically who we are. And our voice is indistinguishable from how other people see us. It is a mask that we wear in society. But our relationship with our own voice is far from obvious. We rarely use our voice for ourselves; we use it as a gift to give to others. It is how we touch each other. It's a dialectical grooming.

Som vi kan se är den mänskliga rösten alltså ganska meningsfull och viktig på artnivå på samhällsnivå - det är så vi kommunicerar och formar anknytningar. Och på personlig och mellanmänsklig nivå delar vi med oss av mycket mer än ord och information med rösten. Vi delar med oss av vem vi är. Och vår röst kan inte särskiljas från hur andra ser oss. Det är en mask vi bär i samhället. Men vårt förhållande till vår röst är långtifrån självklart. Vi använder sällan rösten för oss själva; den är en gåva vi ger till andra. Det är hur vi berör varandra. Det är dialektal övning.

But what do we think about our own voice? So please raise your hand if you don't like the sound of your voice when you hear it on a recording machine.

Men vad tycker vi om vår egen röst? Räck upp handen om du inte gillar ljudet av din röst när du hör den inspelad.

(Laughter)

(Skratt)

Yeah, thank you, indeed, most people report not liking the sound of their voice recording. So what does that mean? Let's try to understand that in the next 10 minutes. I'm a researcher at the MIT Media Lab, part of the Opera of the Future group, and my research focuses on the relationship people have with their own voice and with the voices of others. I study what we can learn from listening to voices, from the various fields, from neurology to biology, cognitive sciences, linguistics. In our group we create tools and experiences to help people gain a better applied understanding of their voice in order to reduce the biases, to become better listeners, to create more healthy relationships or just to understand themselves better.

Tack. Det är faktiskt så att de flesta uppger att de inte gillar ljudet av sin inspelade röst. Vad betyder det? Låt oss försöka förstå det under de kommande 10 minuterna. Jag är forskare vid MIT Media Lab, en del av Opera of the Future-gruppen. Min forskning fokuserar på förhållandet människor har med sin egen röst och andras röster. Jag undersöker vad vi kan lära oss av att lyssna till röster från olika områden, från neurologi till biologi, kognitiv vetenskap, lingvistik. I vår grupp skapar vi verktyg och upplevelser för att hjälpa människor att få en bättre förståelse för sin röst i syfte att minska fördomar, bli bättre lyssnare, skapa sundare förhållanden eller bara förstå sig själva bättre.

And this really has to come with a holistic approach to the voice. Because, think about all the applications and implications that the voice may have, as we discover more about it. Your voice is a very complex phenomenon. It requires a synchronization of more than 100 muscles in your body. And by listening to the voice, we can understand possible failures of what happens inside. For example: listening to very specific types of turbulences and nonlinearity of the voice can help predict very early stages of Parkinson's, just through a phone call. Listening to the breathlessness of the voice can help detect heart disease. And we also know that the changes of tempo inside individual words are a very good marker of depression.

Det måste göras med en holistisk inställning till rösten. För tänk på alla användningsområden rösten kan få, när vi upptäcker mer om den. Din röst är ett komplext fenomen. Den kräver att fler än 100 muskler i din kropp synkroniseras. Genom att lyssna till rösten kan vi förstå de möjliga fel som kan uppstå inuti. Till exempel: Att lyssna på särskilda sorters turbulens och ickelinjäritet i rösten, kan hjälpa till att upptäcka väldigt tidiga stadier av Parkinsons bara genom ett telefonsamtal. Att lyssna efter luftläckage i rösten kan hjälpa till att upptäcka hjärtsjukdom. Vi vet också att tempoförändringar i enskilda ord är en mycket bra markör för depression.

Your voice is also very linked with your hormone levels. Third parties listening to female voices were able to very accurately place the speaker on their menstrual cycle. Just with acoustic information. And now with technology listening to us all the time, Alexa from Amazon Echo might be able to predict if you're pregnant even before you know it. So think about --

Din röst är också väldigt kopplad till dina hormonnivåer. Utomstående som lyssnade på kvinnoröster kunde väldigt träffsäkert placera talaren i deras menstruationscykel. Utifrån akustisk information. Och med tekniken som lyssnar på oss hela tiden, kan kanske Alexa från Amazon Echo förutsäga att du är gravid innan du själv vet om det. Så tänk på ...

(Laughter)

(Skratt)

Think about the ethical implications of that. Your voice is also very linked to how you create relationships. You have a different voice for every person you talk to. If I take a little snippet of your voice and I analyze it, I can know whether you're talking to your mother, to your brother, your friend or your boss. We can also use, as a predictor, the vocal posture. Meaning, how you decide to place your voice when you talk to someone. And your vocal posture, when you talk to your spouse, can help predict not only if, but also when you will divorce.

Tänk på de etiska följderna av det. Din röst är också nära kopplad till hur du skapar relationer. Du har olika röst för varje person du pratar med. Om jag tar en snutt av din röst och analyserar den så kan jag veta om du pratar med din mamma, din bror, din vän eller din chef. Vi kan också använda röstens hållning som prediktor. Alltså, hur du placerar din röst när du pratar med någon. Din rösthållning när du pratar med din partner, kan förutsäga, inte bara om, utan när ni kommer att skiljas.

So there is a lot to learn from listening to voices. And I believe this has to start with understanding that we have more than one voice. So, I'm going to talk about three voices that most of us possess, in a model of what I call the mask. So when you look at the mask, what you see is a projection of a character. Let's call that your outward voice. This is also the most classic way to think about the voice, it's a way of projecting yourself in the world. The mechanism for this projection is well understood. Your lungs contract your diaphragm and that creates a self-sustained vibration of your vocal folds, that creates a sound. And then the way you open and close the cavities in your mouth, your vocal tract is going to transform the sound.

Det finns mycket att lära av att lyssna till röster. Jag tror att det måste börja med förståelse för att vi har fler än en röst. Jag ska prata om tre röster som de flesta av oss har enligt en modell som jag kallar masken. När man tittar på masken ser man en projektion av en person. Vi kallar det din yttre röst. Det är också det mest klassiska sättet att se på rösten. Som ett sätt att presentera sig själv i världen. Mekanismen bakom detta är väl förstådd. Lungorna pressar ihop diafragman, det skapar en vibration i stämbanden vilket skapar ett ljud. Beroende på hur du öppnar och stänger munhålan så förändrar ansatsröret ljudet.

So everyone has the same mechanism. But voices are quite unique. It's because very subtle differences in size, physiology, in hormone levels are going to make very subtle differences in your outward voice. And your brain is very good at picking up those subtle differences from other people's outward voices. In our lab, we are working on teaching machines to understand those subtle differences. And we use deep learning to create a real-time speaker identification system to help raise awareness on the use of the shared vocal space -- so who talks and who never talks during meetings -- to increase group intelligence.

Alla har samma mekanism. Men alla röster är unika. Det beror på att små skillnader i storlek, fysiologi och hormonnivåer skapar små skillnader i din yttre röst. Och din hjärna är väldigt bra på att fånga upp de små skillnaderna i andra människors yttre röster. I vårt labb arbetar vi med att lära maskiner förstå de små skillnaderna. Vi använder djup maskininlärning till ett realtidssystem för talidentifiering för att öka medvetenheten om användningen av vårt gemensamma vokala utrymme - vem som pratar och vem som aldrig pratar på möten - för att öka gruppens samlade kunskap.

And one of the difficulties with that is that your voice is also not static. We already said that it changes with every person you talk to but it also changes generally throughout your life. At the beginning and at the end of the journey, male and female voices are very similar. It's very hard to distinguish the voice of a very young girl from the voice of a very young boy. But in between, your voice becomes a marker of your fluid identity. Generally, for male voices there's a big change at puberty. And then for female voices, there is a change at each pregnancy and a big change at menopause. So all of that is the voice other people hear when you talk. So why is it that we're so unfamiliar with it? Why is it that it's not the voice that we hear? So, let's think about it.

En av svårigheterna med det är att rösten inte är statisk. Vi har ju sagt att den förändras beroende på vem du pratar med. Men den förändras också under livets gång. I början och i slutet av resan är manliga och kvinnliga röster väldigt lika. Det är svårt att skilja en väldigt ung flickas röst från en väldigt ung pojkes röst. Men däremellan är rösten en markör för din flytande identitet. Generellt sett sker en stor förändring i mansrösten vid puberteten. För kvinnoröster sker en förändring vid varje graviditet och en stor förändring i klimakteriet. Allt det är den röst andra hör när du pratar. Så varför är vi så obekanta med den? Varför är det inte den röst vi själva hör? Fundera på det.

When you wear a mask, you actually don't see the mask. And when you try to observe it, what you will see is inside of the mask. And that's your inward voice. So to understand why it's different, let's try to understand the mechanism of perception of this inward voice. Because your body has many ways of filtering it differently from the outward voice. So to perceive this voice, it first has to travel to your ears. And your outward voice travels through the air while your inward voice travels through your bones. This is called bone conduction. Because of this, your inward voice is going to sound in a lower register and also more musically harmonical than your outward voice. Once it travels there, it has to access your inner ear. And there's this other mechanism taking place here. It's a mechanical filter, it's a little partition that comes and protects your inner ear each time you produce a sound. So it also reduces what you hear. And then there is a third filter, it's a biological filter. Your cochlea -- it's a part of your inner ear that processes the sound -- is made out of living cells. And those living cells are going to trigger differently according to how often they hear the sound. It's a habituation effect. So because of this, as your voice is the sound you hear the most in your life, you actually hear it less than other sounds.

När du har på en mask, ser du inte själva masken. Och när du försöker titta på den så ser du insidan av masken. Det är din inåtvända röst. För att förstå skillnaden ska vi försöka förstå den inåtvända röstens uppfattningsmekanism. För kroppen har många sätt att filtrera den annorlunda än den yttre rösten. För att uppfatta den här rösten måste den först nå öronen. Din yttre röst färdas genom luften medan din inåtvända röst färdas genom skelettet. Det kallas benledning. På grund av detta, kommer din inåtvända röst att låta lägre och mer musikaliskt ren än din yttre röst. När den har nått dit, kommer den åt ditt inneröra. Där finns en annan mekanism. Det är ett mekaniskt filter, en liten skiljevägg som skyddar innerörat varje gång du gör ett ljud. Den minskar också det du hör. Sedan finns ett tredje filter, ett biologiskt filter. Din hörselsnäcka, en del av innerörat som behandlar ljudet, är gjord av levande celler. De levande cellerna kommer att reagera olika beroende på hur ofta de hör ett ljud. Det är en tillvänjningseffekt. På grund av detta, eftersom din röst är det ljud du hör mest i livet, hör du den mindre än andra ljud.

Finally, we have a fourth filter. It's a neurological filter. Neurologists found out recently that when you open your mouth to create a sound, your own auditory cortex shuts down. So you hear your voice but your brain actually never listens to the sound of your voice. Well, evolutionarily that might make sense, because we know cognitively what we are going to sound like so maybe we don't need to spend energy analyzing the signal. And this is called a corollary discharge and it happens for every motion that your body does. The exact definition of a corollary discharge is a copy of a motor command that is sent by the brain. This copy doesn't create any motion itself but instead is sent to other regions of the brain to inform them of the impending motion. And for the voice, this corollary discharge also has a different name. It is your inner voice.

Slutligen finns ett fjärde filter. Det är ett neurologiskt filter. Neurologer upptäckte nyligen att när man öppnar munnen för att producera ett ljud, så stänger hörselbarken ner. Du hör din egen röst men hjärnan lyssnar egentligen inte på hur rösten låter. Evolutionsmässigt är det vettigt, för vi vet ju kognitivt hur vi låter. Då kanske vi inte behöver lägga energi på att analysera den signalen. Detta kallas för en efferent kopia och sker för varje rörelse kroppen gör. Den exakta definitionen av en efferent kopia är en kopia av ett rörelsekommando som skickas ut av hjärnan. Kopian skapar ingen rörelse men skickas till andra delar av hjärnan för att informera dem om den kommande rörelsen. När det gäller rösten, har den efferenta kopian ett annat namn. Det är din inre röst.

So let's recapitulate. We have the mask, the outward voice, the inside of the mask, your inward voice, and then you have your inner voice. And I like to see this one as the puppeteer that holds the strings of the whole system. Your inner voice is the one you hear when you read a text silently, when you rehearse for an important conversation. Sometimes it's hard to turn it off, it's really hard to look at the text written in your native language, without having this inner voice read it. It's also the voice that refuses to stop singing the stupid song you have in your head.

Låt oss sammanfatta. Vi har masken, den yttre rösten insidan av masken, den inåtvända rösten och så finns den inre rösten. Jag ser den som marionettspelaren som håller i trådarna för hela systemet. Din inre röst är den du hör när du läser en text för dig själv, när du övar inför ett viktigt samtal. Ibland är den svår att stänga av. Det är jättesvårt att se en text skriven på ditt modersmål och inte höra den inre rösten läsa den. Det är också den röst som vägrar sluta sjunga den dumma sången du har i huvudet.

(Laughter)

(Skratt)

And for some people it's actually impossible to control it. And that's the case of schizophrenic patients, who have auditory hallucinations. Who can't distinguish at all between voices coming from inside and outside their head. So in our lab, we are also working on small devices to help those people make those distinctions and know if a voice is internal or external.

Och för en del människor är det faktiskt omöjligt att kontrollera den. Så är det för patienter med schizofreni som har hörselhallucinationer. De kan inte särskilja mellan röster som kommer inifrån eller utifrån. Så vi jobbar också med små apparater för att hjälpa dem göra den särskiljningen och veta om en röst är intern eller extern.

You can also think about the inner voice as the voice that speaks in your dream. This inner voice can take many forms. And in your dreams, you actually unleash the potential of this inner voice. That's another work we are doing in our lab: trying to access this inner voice in dreams. So even if you can't always control it, the inner voice -- you can always engage with it through dialogue, through inner dialogues. And you can even see this inner voice as the missing link between thought and actions.

Man kan också se den inre rösten som den röst som talar i dina drömmar. Den inre rösten kan ha många skepnader. I dina drömmar frigör du den inre röstens kraft. Det är en annat grej vi gör i vårt labb: Vi försöker nå den inre rösten i drömmar. Så även om du inte alltid kan styra den, den inre rösten, så kan du alltid umgås med den genom dialog, inre dialog. Man kan till och med se den inre rösten som den felande länken mellan tanke och handling.

So I hope I've left you with a better appreciation, a new appreciation of all of your voices and the roles they play inside and outside of you -- as your voice is a very critical determinant of what makes you human and of how you interact with the world.

Jag hoppas jag har gett dig en ökad förståelse och uppskattning för alla dina röster och den betydelse de har inom och utanför dig eftersom din röst är en avgörande del av det som gör dig mänsklig och hur du interagerar med världen.

Thank you.

Tack.

(Applause)

(Applåder)

Related talks

Max Little: A test for Parkinson's with a phone call

Rupal Patel: Synthetic voices, as unique as fingerprints

Annie Murphy Paul: What we learn before we're born

Shaylin Schundler: Why does your voice change as you get older?

Eleanor Longden: The voices in my head

Beardyman: The polyphonic me
