Blaise Agüera y Arcas: How PhotoSynth can connect the world's images

What I'm going to show you first, as quickly as I can, is some foundational work, some new technology that we brought to Microsoft as part of an acquisition almost exactly a year ago. This is Seadragon, and it's an environment in which you can either locally or remotely interact with vast amounts of visual data.

Nitakachowaonyesha kwanza, haraka niwezavyo, ni kazi ya msingi, katika teknolojia mpya ambayo tuliingiza Microsoft kama sehemu ya ununuzi wa kampuni takriban mwaka mmoja uliopita. Hii ni Seadragon. Na ni mfumo ambao unaweza, kwa karibu au mbali, kujiunganisha na kufanyia kazi takwimu mbalimbali za picha.

We're looking at many, many gigabytes of digital photos here and kind of seamlessly and continuously zooming in, panning through it, rearranging it in any way we want. And it doesn't matter how much information we're looking at, how big these collections are or how big the images are. Most of them are ordinary digital camera photos, but this one, for example, is a scan from the Library of Congress, and it's in the 300 megapixel range. It doesn't make any difference because the only thing that ought to limit the performance of a system like this one is the number of pixels on your screen at any given moment. It's also very flexible architecture. This is an entire book, so this is an example of non-image data. This is "Bleak House" by Dickens. Every column is a chapter. To prove to you that it's really text, and not an image, we can do something like so, to really show that this is a real representation of the text; it's not a picture. Maybe this is an artificial way to read an e-book. I wouldn't recommend it.

Hapa tunaziangalia nyingi, katika kipimo cha picha cha gigabyte na bila kukatika na kwa kuendelea kukuza mfululizo, kulengesha kwenye kitu, kuirekebisha vile tutakavyo. Bila kujali taarifa ngapi tunaziangalia, zina ukubwa gani au nyingi kiasi gani. Nyingi kati yake ni picha za kawaida zilizopigwa na kamera za dijito, hii hapa, kwa mfano, ni kivuli cha picha kutoka Maktaba ya Bunge, na iko katika kipimo cha vipandepicha 300. Haileti tofauti yoyote kwa sababu kitu pekee kitakachoweza kuzuia ufanisi wa mfumo kama huu ni idadi ya vipandepicha kwenye skrini yako wakati wowote. Huu ni usanifu huria. Hiki ni kitabu kizima, mfano wa takwimu ambazo si picha. Hiki ni Bleak House kilichoandikwa na Dickens. Kila safu ni sura. Kuwathibitishia kwamba haya ni maandishi, na siyo picha, tunaweza kufanya kama hivi, ili kuweza kuonyesha kuwa hiki ni kielelezo cha maandishi: na siyo picha. Labda hii ni njia nyingine ya kusoma kitabu cha nakala za elektroniki. Siwezi kuipendekeza.

This is a more realistic case, an issue of The Guardian. Every large image is the beginning of a section. And this really gives you the joy and the good experience of reading the real paper version of a magazine or a newspaper, which is an inherently multi-scale kind of medium. We've done something with the corner of this particular issue of The Guardian. We've made up a fake ad that's very high resolution -- much higher than in an ordinary ad -- and we've embedded extra content. If you want to see the features of this car, you can see it here. Or other models, or even technical specifications. And this really gets at some of these ideas about really doing away with those limits on screen real estate. We hope that this means no more pop-ups and other rubbish like that -- shouldn't be necessary.

Huu ni mfano wa ukweli. Hili ni toleo la The Guardian. Kila picha kubwa ni mwanzo wa sura. Hii inakupa raha na uzoefu mzuri wa kusoma toleo halisi la jarida au gazeti, ambalo mara nyingi ni chombo cha habari chenye kina na mapana. Pia tumefanya kitu kidogo kwenye kona ya toleo hili la The Guardian. Tumetengeneza tangazo la uongo na ambalo liko katika kiwango cha juu sana -- kuliko ambavyo ungeweza kuona kwenye tangazo la kawaida -- na tumeongezea vitu vya ziada. Kama unataka kujua taarifa za undani wa gari hili, unaweza kuziona hapa. Au miundo mingine, au hata maelezo ya kina ya kiufundi. Hii inaingia katika baadhi ya haya mawazo katika kuondokana na vikwazo vya ufanisi wa skrini. Tunatumaini kwamba hii ina maana kwamba hakutakuwa na vipeperushitovuti tena na taka nyingine kama hizo -- hazitakuwa muhimu.

Of course, mapping is one of those obvious applications for a technology like this. And this one I really won't spend any time on, except to say that we have things to contribute to this field as well. But those are all the roads in the U.S. superimposed on top of a NASA geospatial image. So let's pull up, now, something else. This is actually live on the Web now; you can go check it out.

Hakika, ramani zitakuwa moja ya matumizi muhimu ya teknolojia kama hii. Na sitapoteza muda kwenye hili, isipokuwa kwamba tuna vitu vya kuchangia katika eneo hili pia. Lakini hizi zote ni barabara za Marekani zilizowekwa juu ya picha za kijiografia za NASA. Sasa hebu tuangalie kitu kingine. Hii ipo hewani kwenye mtandao kwa sasa; unaweza kwenda na kuiangalia.

This is a project called Photosynth, which marries two different technologies. One of them is Seadragon and the other is some very beautiful computer-vision research done by Noah Snavely, a graduate student at the University of Washington, co-advised by Steve Seitz at U.W. and Rick Szeliski at Microsoft Research. A very nice collaboration. And so this is live on the Web. It's powered by Seadragon. You can see that when we do these sorts of views, where we can dive through images and have this kind of multi-resolution experience.

Huu ni mradi unaoitwa Photosynth, ambao unajumuisha teknolojia mbili tofauti. Mojawapo ni Seadragon na nyingine ni ya utafiti wa kuona katika kompyuta uliofanywa na Noah Snavely, mwanafunzi wa chuo kikuu cha Washington, na kushauriwa na Steve Seitz hapo UW na Rick Szeliski katika kitengo cha utafiti cha Microsoft. Ushirikiano mzuri sana. Kwa hiyo hii iko hewani kwenye mtandao. Na imewezeshwa na Seadragon. Unaweza kuona wakati tukifanya vielelezo hivi, ambapo tunaweza kuzamia kwenye picha na kuwa na aina hii ya kuweza kuona taswira mbalimbali.

But the spatial arrangement of the images here is actually meaningful. The computer vision algorithms have registered these images together so that they correspond to the real space in which these shots -- all taken near Grassi Lakes in the Canadian Rockies -- all these shots were taken. So you see elements here of stabilized slide-show or panoramic imaging, and these things have all been related spatially. I'm not sure if I have time to show you any other environments. Some are much more spatial. I would like to jump straight to one of Noah's original data-sets -- this is from an early prototype that we first got working this summer -- to show you what I think is really the punch line behind the Photosynth technology. It's not necessarily so apparent from looking at the environments we've put up on the website. We had to worry about the lawyers and so on.

Lakini hapa mpangilio wa mahusiano ya picha unaleta maana zaidi. Miundonamba ya picha za kompyuta imezisajili hizi picha pamoja, ili ziendane na sehemu halisi ambako picha hizi zilipigwa -- zote zilipigwa karibu na Ziwa Grassi huko Canadian Rockies -- zilichukuliwa. Kwa hiyo unaona vipengee hapa za vielelezopicha vilivyokamilika au picha za kupita. Na vitu hivi vyote vimehusianishwa pamoja. Sina uhakika kama nina muda wa kuwaonyesha taswira nyingine. Kuna mengine ambayo yanahusiana zaidi. Nitaenda moja kwa moja kwenye moja ya seti za takwimu halisi za Noah -- na hii inatoka kwenye toleo la mfano la Photosynth ya awali ambayo tuliipata wakati tukifanya kazi majira ya joto -- kukuonyesha ninachokifikiria ni mzaha tu wa teknolojia hii, teknolojia ya Photosynth. Na si dhahiri sana kwa kuangalia katika mfumo tuliouweka kwenye tovuti. Ilibidi tuanze kuhofia juu ya wanasheria na mengineyo.

This is a reconstruction of Notre Dame Cathedral that was done entirely computationally from images scraped from Flickr. You just type Notre Dame into Flickr, and you get some pictures of guys in T-shirts, and of the campus and so on. And each of these orange cones represents an image that was discovered to belong to this model. And so these are all Flickr images, and they've all been related spatially in this way. We can just navigate in this very simple way.

Huu ni ujengwaji tena wa kanisa kuu la dayosisi ya Notre Dame ambao ulifanywa kwa kutumia kompyuta peke yake kutoka kwenye picha zilizopatikana kwenye Flickr. Unaandika Notre Dame kwenye Flickr, na unapata picha za watu waliovaa T-shirts, na za eneo la chuo na mengineyo. Na kati ya kila hizi pia za rangi ya chungwa zinawakilisha taswira ambazo ziligunduliwa zinauhusiano na muundo huu. Na hizi zote ni picha za Flickr, na zote zimehusishwa kwa njia hii. Na tunaweza kutembelea kwa njia hii rahisi.

(Applause)

(Makofi).

(Applause ends)

You know, I never thought that I'd end up working at Microsoft. It's very gratifying to have this kind of reception here.

Unajua, sikufikiria kuwa nitakuja kufanya kazi Microsoft. Ni faraja kubwa sana kupata mapokezi kama haya hapa.

(Laughter)

(Kicheko).

I guess you can see this is lots of different types of cameras: it's everything from cell-phone cameras to professional SLRs, quite a large number of them, stitched together in this environment. If I can find some of the sort of weird ones -- So many of them are occluded by faces, and so on. Somewhere in here there is actually a series of photographs -- here we go. This is actually a poster of Notre Dame that registered correctly. We can dive in from the poster to a physical view of this environment.

Natumaini mnaweza kuona hizi ni kamera nyingi tofauti: ni kila kitu kutoka kwenye kamera za simu za mkononi mpaka kamera za kitaalam za SLRs, ni nyingi, zikiwa pamoja katika mfumo huu. Na kama nitaweza, nitatafuta zile za ajabu. Nyingi zao zimezibwa kwa sura za watu, na mengineyo. Kati ya hapo kuna mlolongo wa picha -- naam hapa. Hii hakika ni picha ya Notre Dame ambayo imesajiliwa kwa usahihi. Tunaweza kuingia ndani ya picha katika mazingira ya maumbile yake.

What the point here really is is that we can do things with the social environment. This is now taking data from everybody -- from the entire collective memory, visually, of what the Earth looks like -- and link all of that together. Those photos become linked, and they make something emergent that's greater than the sum of the parts. You have a model that emerges of the entire Earth. Think of this as the long tail to Stephen Lawler's Virtual Earth work. And this is something that grows in complexity as people use it, and whose benefits become greater to the users as they use it. Their own photos are getting tagged with meta-data that somebody else entered. If somebody bothered to tag all of these saints and say who they all are, then my photo of Notre Dame Cathedral suddenly gets enriched with all of that data, and I can use it as an entry point to dive into that space, into that meta-verse, using everybody else's photos, and do a kind of a cross-modal and cross-user social experience that way. And of course, a by-product of all of that is immensely rich virtual models of every interesting part of the Earth, collected not just from overhead flights and from satellite images and so on, but from the collective memory.

Cha muhimu hapa ni nini tunaweza kufanya na mfumo huu. Hii ni kuchukua takwimu kutoka kwa kila mtu -- kutoka katika mkusanyiko wa kumbukumbu za taswira, namna dunia ilivyo -- na kuzijumuisha zote. Picha zote zinaunganishwa pamoja, na zinafanya kitu kutokea ambacho ni kubwa zaidi ya jumla ya sehemu ndogondogo. Una mfano ambao unatokea katika dunia nzima. Fikiria hii ni kama mkia mrefu wa kazi za picha za dunia za Stephen Lawler. Na hiki kitu ambacho kinakua na kuongeza mchangamano jinsi watu wanavyotumia, na faida yake inakuwa kubwa kwa watumiaji jinsi wanavyotumia. Picha zao zinaunganishwa na meta-data ambavyo mtu mwingine ameviingiza. Kama kuna mtu angewaunganisha watakatifu hawa wote na kusema wao ni akina nani, kwa hiyo picha yangu ya kanisa kuu la Notre Dame ingeboreshwa na vielelezo hivyo vyote, na ninaweza kuitumia kama njia ya kuingia katika sehemu hiyo, katika takwimumaneno, kwa kutumia picha za watu wengine, na kufanya mwingiliano na mwingiliano wa watumiaji kwa njia hiyo. Na kwa hakika, matokeo ya yote hayo ni mifumo thabiti ya picha wa kila sehemu ya dunia, iliyokusanywa siyo tu kwa angani na kutoka kwenye picha za setilaiti na mengineyo, bali kutoka kwenye majumuisho ya kumbukumbu.

Thank you so much.

Asanteni sana.

(Applause)

(Makofi).

(Applause ends)

Chris Anderson: Do I understand this right? What your software is going to allow, is that at some point, really within the next few years, all the pictures that are shared by anyone across the world are going to link together?

Chris Anderson: Nimekuelewa? Kuwa programu yako itaruhusu, kuwa, wakati fulani, katika kipindi cha miaka michache ijayo, picha zote zitakazokuwa zikigawanwa na mtu yeyote duniani zitaunganishwa pamoja?

BAA: Yes. What this is really doing is discovering, creating hyperlinks, if you will, between images. It's doing that based on the content inside the images. And that gets really exciting when you think about the richness of the semantic information a lot of images have. Like when you do a web search for images, you type in phrases, and the text on the web page is carrying a lot of information about what that picture is of. What if that picture links to all of your pictures? The amount of semantic interconnection and richness that comes out of that is really huge. It's a classic network effect.

BAA: Ndiyo. Kinachotokea hapa ni uvumbuzi. Inatengeneza viuongotovuti, kati ya picha. Na inafanya hivyo ikitegemea yaliyomo ndani ya picha. Hii inaleta msisimko zaidi ukifikiria kuhusu ubora wa taarifa zilizomo kwenye picha hizo. Kwa mfano ukiwa unatafuta picha kwenye mtandao, unaandika vifungu vya maneno na maandishi katika ukurasa wa tovuti inabeba taarifa kuhusu picha hiyo ni ya nini. Sasa, itakuwaje kama picha hiyo inaunganisha picha zako zote? Hapo idadi ya miunganiko ya taarifa na idadi ya ubora ambao unakuja pamoja nayo ni kubwa sana. Ni matokeo ya kiwango cha juu cha muungano wa mtandao.

CA: Truly incredible. Congratulations.

CA: Blaise, hii ni nzuri sana. Hongera.

BAA: Thanks so much.

BAA: Asante sana.

(Applause)

(Makofi).

(Applause ends)
