We really need to put the best we have to offer within reach of our children. If we don't do that, we're going to get the generation we deserve. They're going to learn from whatever it is they have around them.
我们的确需要把最好的资讯提供给孩子们,让他们随手可得。 如果我们不那么做,那我们会得到我们应得的一代, 他们将会从周围的一切事物中随意地学习知识。
And we, as now the elite, parents, librarians, professionals, whatever it is, a bunch of our activities are, in fact, in trying to get the best we have to offer within reach of those around us, or as broadly as we can. I'm going to start and end this talk with a couple things that are carved in stone. One is what's on the Boston Public Library. Carved above their door is, "Free to All." It's kind of an inspiring statement, and I'll go back at the end of this. I'm a librarian, and what I'm trying to do is bring all of the works of knowledge to as many people as want to read it. And the idea of using technology is perfect for us. I think we have the opportunity to one-up the Greeks. It's not easy to one-up the Greeks. But with the industriousness of the Egyptians, they were able to build the Library of Alexandria -- the idea of a copy of every book of all the peoples of the world. The problem was you actually had to go to Alexandria to go to it. On the other hand, if you did, then great things happened. I think we can one-up the Greeks and achieve something. And I'm going to try to argue only one point today: that universal access to all knowledge is within our grasp. So if I'm successful, then you'll actually come away thinking, yeah, we could actually achieve the great vision of everything ever published, everything that was ever meant for distribution, available to anybody in the world that's ever wanted to have access to it.
而我们, 社会精英、家长、图书管理员、专业人士、以及其他各界人士, 事实上有很多工作,就是尽可能地把我们所能提供的最好的东西 放到身边的人触手可及的地方,并尽量将范围扩大 在这个演讲的开始和结尾, 我会讲到两句刻在石头上的话 其中一句在波士顿公共图书馆。 图书馆的大门上方刻着“向所有人免费开放” 这句话很鼓舞人心, 我会在演讲结束时再回到这一点。 我是一个图书管理员,我尽力地把所有的知识以及作品 提供给所有想要阅读它们的人。 运用现代科技对我们来说是十分理想的。 我认为我们有机会超越希腊人。 想要超越希腊人是不容易的。但依靠埃及人的勤勉, 他们建成了亚历山大图书馆, 以实现收藏世界上各民族每一本书的梦想。 问题是,你需要亲自去亚历山大才能使用它。 另一个方面,如果你真这么做了,就会有很棒的收获。 我认为我们可以比希腊人更胜一筹,去实现某些梦想。 今天我只想论证一个观点: 让所有人都能获取全部知识,这已是我们力所能及的事。 如果我成功了,你离开时一定会这样想, 是的,我们的确可以实现这个伟大的愿景:把所有已经出版, 所有曾经为了传播而创作的作品, 提供给世界上每一个想要获取它们的人。
Yes, there's issues about how money should be distributed, and that's still being refigured out. But I'd say there's plenty of money, and there's plenty of demand, so we can actually achieve that. But I'm going to go over the technological, social and sort of where are we as a whole, trying to get to that particular vision. And the way I'm going to try to do this is do it like the Amazon.com website, the books, music, video and just go step -- media type by media type, just go and say, all right, how're we doing on this?
是的,这时出现了资金该如何分配的问题, 我们还在探索怎样解决这个问题 但我要说,这里有足够的资金,也有大量的需求。 所以我们是可以实现它的。 但是我会逐步探讨在技术层面,社会层面 以及目前进行的成果以达到这个目标 我将尽力用类似Amazon网站的方式来完成它 书籍、音乐、影像分门别类,按媒介的种类进行归档 然后,我们选择其中一个类别开始
So if we start with books, you know, sort of where are we? Well, first you have to, as an engineer, scope the problem. How big is it? If you wanted to put all of the published works online so that anybody could have it available, well, how big a problem is it? Well, we don't really know, but the largest print library in the world is the Library of Congress. It's 26 million volumes, 26 million volumes. It is, by far and away, the largest print library in the world. And a book, if you had a book, is about a megabyte, so -- you know, if you had it in Microsoft Word. So a megabyte, 26 million megabytes is 26 terabytes -- it goes mega-, giga-, tera-. 26 terabytes. 26 terabytes fits in a computer system that's about this big, on spinning Linux drives, and it costs about 60,000 dollars. So for the cost of a house -- or around here, a garage -- you can put, you can have spinning all of the words in the Library of Congress. That's pretty neat.
假设我们从书籍开始,就从我们现有的资源着手 首先,作为一个工程师,你必须要衡量一下问题的范围,有多少数量的图书? 我们现在设想的是把所有出版了的作品放在网上 因此任何人都可以随意取阅,那么,这个问题有多大呢? 我们不知道,世界上最大的出版物藏库 就是美国国会图书馆——那里有2600万卷藏书 它是目前世界上最大的出版物图书馆 假设每一本书,大约是1兆字节的容量 并且,这本书是微软Word格式的 一兆一本书,2600万兆字节(MB)就是26太字节(TB) 容量大小单位依次是MB,GB,TB,那里有26TB容量的信息 如果把26TB的数据导入到大约这么大、装有 Linux操作系统的计算机中,需要花费6万美元 因此,只需要一所房子,甚至只是这么大的一个车库 你就可以存储美国国会图书馆中所有书籍的内容 而且这样存放是非常简洁的
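The storage estimate above is simple enough to check by hand. The snippet below is only a sketch under the talk's own round numbers (26 million volumes, about one megabyte per book) and assumes decimal units, where one terabyte is a million megabytes. The function name `library_size_tb` is made up for illustration.

```python
# Back-of-the-envelope storage estimate for the text of a print library.
# Figures from the talk: 26 million volumes, roughly 1 MB per book.
# Assumption: decimal units, so 1 TB = 1,000,000 MB.

MB_PER_TB = 1_000_000


def library_size_tb(volumes: int, mb_per_book: float = 1.0) -> float:
    """Return the estimated collection size in terabytes."""
    return volumes * mb_per_book / MB_PER_TB


if __name__ == "__main__":
    # 26 million books at about 1 MB each
    print(f"{library_size_tb(26_000_000):.0f} TB")
```

Changing `mb_per_book` shows how sensitive the total is to the per-book guess: scanned page images, as opposed to plain text, would be far larger than one megabyte per book.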
Then the question is, what do you get? You know, is it worth trying to get there? Do you actually want it online? Some of the first things that people do is they make book readers that allow you to search inside the books, and that's kind of fun. And you can download these things, and look around them in new and different ways. And you can get at them remotely, if you happen to have a laptop. There's starting to be some of these sort of page turn-y interfaces that look a whole lot like books in certain ways, and you can search them, make little tabs, and it's kind of cute -- still very book-like -- on your laptop. But I don't know, reading things on a laptop -- whenever I pull up my laptop, it always feels like work. I think that's one of the reasons why the Kindle is so great. I don't have to feel like I'm at work to read a Kindle. It's starting to be a little bit more specified. But I have to say that there's older technologies that I tend to like. I like the physical book. And I think we can go and use our technology to go and digitize things, put them on the Net, and then download, print them and bind them, and end up with books again.
但问题是:你得到了什么? 这样做是有价值的吗? 你确实希望把他们放到网上? 一开始人们做的事情是让读者们 可以在书籍里任意搜寻一些资料,这种做法非常有趣 你可以下载这些读物,然后用新的、不同往常的方式去查阅他们 如果你有一台笔记本电脑,那么你就可以进行远程下载 程序中会有一个可以翻页的书籍界面 使得程序看起来就像是一本书 你可以搜寻书籍中的某些内容,设置书签,这会非常简洁有趣 并且跟普通的书籍一样,呈现在你的笔记本电脑中 但是,我觉得,在笔记本电脑上阅读电子书 感觉就像是在工作 也许这就是Kindle之所以如此成功的一个原因 用Kindle阅读电子书不会让你觉得像是在工作 让人感觉就是在阅读 但是,我还是喜欢那些老技术做出来的东西 我喜欢阅读实体书籍 我认为可以用技术将书籍内容数字化 然后发布到网络上,提供给人们下载 之后再将这些内容打印、装订,最终又形成一本书的样子
And we sort of said, well, how hard is this? And it turns out to not be very hard. We actually went off to make a bookmobile. And a bookmobile -- the size of a van with a satellite dish, a printer, binder and cutter, and kids make their own books. It costs about three dollars to download, print and bind a normal, old book. And they actually come out kind of nice looking. You can actually get really good-looking books for on the order of one penny per page, sort of the parts cost for doing this.
话是这么说的,但是,这将会有多难呢? 事实上,一点儿也不难 我们曾经做过一个流动图书馆 流动图书馆,就是一辆小货车(面包车),里面有卫星天线 打印机,粘合剂和切割用具,这样年轻人就可以做出他们自己的书 下载,打印并装订一本普通的书籍需要花费3美元 这样的书看起来非常漂亮 你可以得到这些好看的书 成本也就是一美分一页,当然这只是做这本书所花费成本的一部分
So the idea of -- this technology actually may end up putting books back in people's hands again. There are some other bookmobiles running around. This is Eric Eldred making books at Walden Pond -- Thoreau's works. This is just before he got kicked out by the Parks Services, for competing with the bookstore there. In India, they've got another couple bookmobiles running around. And this is the opening day at the Library of Alexandria, the new Library of Alexandria, in Egypt. It was quite popularly attended. And kids starting to make their own books, and a happy kid with the first book that he's ever owned. So the idea of being able to use this technology to end up with paper where I can handle sort of sounds a little retro, but I think it still has its place. And being from the Silicon Valley, sort of utopian sort of world, we thought, if we can make this technology work in rural Uganda, we might have something. So we actually got some funding from the World Bank to try it out. And we found in about 30 days we could go and take a couple folks from Silicon Valley, fly them to Uganda, buy a car, set up the first Internet connection at the National Library of Uganda, figure out what they wanted, and get a program going making books in rural Uganda. And it actually -- so technologically, it works.
这项技术最终也许会让书 重新回到人们手中 现在还有其他一些流动图书馆在各地运作 这位是Eric Eldred,他正在Walden Pond(瓦尔登湖)制作梭罗(Thoreau)的作品 这是他被公园管理部门赶走之前拍的 原因是他和那里的书店形成了竞争 在印度,也有另外几辆流动图书馆在运作 这是亚历山大图书馆开馆当天的照片 也就是位于埃及的新亚历山大图书馆 当天来了很多人 孩子们开始制作他们自己的图书 这个孩子第一次拥有了属于他自己的图书,他非常开心 利用这种技术最终得到可以拿在手里的纸质书 听起来有一些复古 但我认为它仍有一席之地 我们来自硅谷这种带点乌托邦色彩的地方 我们想,如果这项技术能在乌干达的乡村行得通 那我们也许就做成了一点事情 于是我们从世界银行获得了一些资助来做试验 我们发现,大约30天内,就可以从硅谷带上几个人 飞往乌干达,买一辆车, 在乌干达国家图书馆建立第一条互联网连接,弄清楚他们想要什么 并启动一个在乌干达乡村制作图书的项目 所以从技术角度来讲,这是可行的
What we found out of this is we didn't have the right books. So the books were in the library. We could get it to people, if they're digitized, but we didn't know how to quite get them digitized. Everybody thought the answer is, send things to India and China. And so we've tried that, and I'll go over that in a moment. There are some newer technologies for delivering that have happened that are actually quite exciting as well. One is a print-on-demand machine that looks like a Rube Goldberg machine. We have one of these things now. It's completely cool. It's all conveyor belt, and it makes a book. And it's called the "Espresso Book Machine," and in about 10 minutes, you can press a button and make a book.
最终,令我们遗憾的是,我们无法得到最合适的书 书都在图书馆里。如果书籍是数字化的,我们就可以把它做出来送给读者 但我们不知道如何将他们数字化 所有人会想到一个主意,把这件事交给印度和中国 我们也正在尝试着,一会儿我将会在这件事情上稍作阐述 现如今,人们发明了一些新的传送技术 这些技术听起来也让人非常兴奋 其中一个就是像鲁布·戈德堡机器的按需打印机 我们现在已经拥有其中的一些设备,这是很酷的 传输带到处都是,而且它可以用来制作图书 这个机器被称作"Espresso Book Machine"(Espresso图书制作机) 你只需要轻轻按下一个按钮,10分钟后,它就会制作出一本书来
Something else I'm quite excited about in this particular domain, beyond these sort of kiosk-y things where you can get books on demand, is some of these new little screens that are coming out. And one of my favorites in this is the $100 laptop. And I don't mean to steal any thunder here, but we've gone and used one of these things to be an e-book reader. So here's one of the beta units and you can -- it actually turns out to be a really good-looking e-book reader. And we have a quick hack that we did to try to put one of our books on it, and it turns out that 200 dots per inch means that you can put scanned books on them that look really good. At 200 dots per inch, it's kind of the equivalent of a 300 dot print laser printer. We're in good enough shape. You actually can go and read scanned books quite easily.
我之所以钟情于这个特殊的领域 不仅仅是这些类似便利店的流动图书馆可以制作任何你想要的书籍 而是这背后即将实现的一个伟大场景 有一种100美元的便携式电脑是我的最爱之一 我并不想在这里博取任何轰动 但我们正在把这种便携式电脑做成电子书阅读器 这是其中一款测试机型 它看起来是一个不错的电子书阅读器 我们正在上面写入第一本图书 它的分辨率达到了每英寸200点 这就意味着你可以把扫描过的图书放在上面,并且看起来非常舒服 每英寸200点的分辨率(200dpi)等价于一台300点激光打印机的打印效果 并且它的造型也不错 大家可以通过它非常方便的去阅读那些被扫描过的图书
So the idea of electronic books is starting to come about. But how do you go about doing all this scanning? So we thought, okay, well, let's try out this send books to India thing. And there was a project with, funded by the National Science Foundation -- sent a bunch of scanners, and the American libraries were supposed to send books. Well, they didn't. They didn't want to send their books. So we bought 100,000 books and sent them to India. And then we learned why you don't want to send books to India. The lesson we learned out of this is, scan your own books. If you really care about books, you're going to scan them better, especially if they're valuable books. If they're new books and you can just, you know, butcher them, because you could just buy another one, that's not such a big deal in terms of doing high-quality scanning. But do things that you love. But the Indians have been scanning a lot of their own books -- about 300,000 now -- doing very well. The Chinese did over a million, and the Egyptians are about 30,000.
电子书的想法就要产生了 但,你怎么去做那些扫描工作? 我们经过思考,好吧,我们把书送到印度去进行扫描 这里有一个项目,它是由美国国家科学基金会出资支持的,购买了 一些扫描仪用于扫描,美国的图书馆理所当然的要将书送出去扫描 但是,他们没有,他们根本就不想把他们自己的书送出去 于是,我们购买了100000本书,将他们送到了印度 接下来,我们就发现了为什么人家图书馆不把书送到印度的原因 我们得到的教训就是,扫描你自己的图书 如果你非常喜欢图书,你就会更加仔细认真的扫描 特别是那些书非常有价值的时候 如果是新书,那就不同了,你可以“残害”他们 因为你可以再买一本新的 对于高质量的扫描来说这不算什么大事 但,一定要去做那些你喜欢做的事 印度人已经高质量的完成了他们自己书籍的扫描 大约有300000本左右 中国人大约完成了有一百万多,埃及人仅仅是30000本的样子
But we sent -- thought, OK, if we're going to need to do this, let's do it in-library. How do we go and do this, and how do we get it down so that it's a cost point that we could afford? And we sort of picked the price point of 10 cents a page. If it's basically the cost of xeroxing to basically digitize, OCR, package it up, make it so that you could download, print and bind it -- the whole shebang -- we would have achieved something. So we started out trying to figure out. How do we get to 10 cents? And we tried these robot things, and they worked pretty well -- sort of these auto-page-turning things. If we can have Mars Rovers, you'd think you could turn pages. But it actually turns out to be pretty hard to turn pages, and the volume isn't there. So anyway -- so we ended up making our own book scanner, and with two digital, high-grade, professional digital cameras, controlled museum lighting, so even if it's a black and white book, you can go and get the proper intonation. So you basically do a beautiful, respectful job. This is not a fax, this is -- the idea is to do a beautiful job as you're going through these libraries. And we've been able to achieve 10 cents a page if we run things in volume. This is what it looks like at the University of Toronto. And actually, it turns out to, you know, pay a living wage. People seem to love it. Yes, it's a little boring, but some people kind of get into the Zen of it. (Laughter) And especially if it's kind of interesting books that you care about, in languages that you can read. We actually have been able to do a pretty good job of this, at getting 10 cents a page. So 10 cents a page, 300 pages in your average book, 30 dollars a book. The Library of Congress, if you did the whole darn thing -- 26 million books -- is about 750 million dollars, right? But a million books, I think, actually would be a pretty good start, and that would cost 30 million dollars. That's not that big a bill.
我们已经送去了这么多,如果我们还需要继续做下去,还是选择在图书馆里吧 我们该如何去做,如何开展工作 我们是否负担得起经费? 我们给出的参考价格是每页10美分 这是静电复印,光字符识别,打包 提供下载,打印及装订等所有事情的总成本 我们可以有所成就 我们试图给出是如何计算出每页10美分这个数据的 我们使用了机器人之类的设备,而且他们工作的相当不错 即类似于自动翻页的东西 如果我们能制造火星漫游号,也就能制造这样的翻页机 但是要想做到翻页还是很困难的,所以完成的数量还是不够大 不论如何,我们制作了属于自己的图书扫描器 用两个优质、专业的数码摄像头 结合着可控的博物馆照明——即便那是一本没有色彩的书 你可以制作出一本不错的书来 你做的将是一件美妙,而且是令人敬仰的工作 这不是传真,这是一个美妙工作的创意 只需要你穿过那些图书馆就可以实现 如果我们能大量进行,我们就可以实现每页10美分的成本 多伦多大学正在做的就是这样的事情 事实上,这个成本可以让很多人承受得了 人们非常喜欢这么去做 是的,有些无趣,但是有些人已进入禅定 (笑声) 特别是当这本趣书是你喜欢的 并且书中的语言也是你能看懂的 我们已经有办法做到控制在每页10分美金 每页10美分,平均每本书300页,也就是一本书30美元 美国国会图书馆,如果你做过调查 有2600万本书——花费在7亿5千万美元,对吧? 但是一百万册图书 - 是个相当不错的开始 这需要花费3千万美元。这样花费就没有那么多了
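The cost figures in this passage follow from one multiplication, which can be sketched as below. The inputs are the talk's own round numbers (10 cents a page, 300 pages in an average book); the function name `scanning_cost_usd` is hypothetical. Note that 26 million books at 30 dollars each works out to 780 million dollars, which the talk rounds to "about 750 million".

```python
# Rough scanning budget, using the talk's figures:
# 10 cents per page and 300 pages in an average book.

def scanning_cost_usd(books: int,
                      pages_per_book: int = 300,
                      cents_per_page: int = 10) -> float:
    """Return the estimated cost in US dollars of scanning `books` books."""
    return books * pages_per_book * cents_per_page / 100


if __name__ == "__main__":
    for label, count in [("one book", 1),
                         ("a million books", 1_000_000),
                         ("Library of Congress", 26_000_000)]:
        print(f"{label}: ${scanning_cost_usd(count):,.0f}")
```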
And what we've been able to do is get into libraries. We've now got eight of these scanning centers in three countries, and libraries are up for having their books scanned. The Getty here is moving their books to the UCLA, which is where we have one of these scanning centers, and scanning their out-of-copyright books, which is fabulous. So we're starting to get the institutional responsibility. The thing we're missing is the 10 cents. If we can get the 10 cents, all the rest of it flows. We've scanned about 200,000 books. Now we're scanning about 15,000 books a month, and it's starting to gear up another factor of two from there.
我们已经做到的是进入图书馆 我们已经在三个国家建立了8个这样的扫描中心 图书馆也愿意让人扫描他们的馆藏 这里的Getty正把他们的书籍送到加利福尼亚大学洛杉矶分校 那里有我们的一个扫描中心 去扫描他们版权已过期的书籍,这太棒了 所以我们开始得到机构层面的承担了 我们缺少的是那每页10美分的经费 只要能拿到这10美分,其余的一切就水到渠成 我们已经扫描了大约20万本书 现在,我们每月扫描大约1万5千本书 而且这个速度正准备再翻一倍
So all in all, that's going very well. And we're starting to move out of the just out-of-copyright into the out-of-print world. So I think of -- we're kind of going from the out-of-copyright, library stuff, and Amazon.com is coming from the in-print world. And I think we'll meet in the middle some place, and have the classic thing that you have, which is a publishing system and a library system working in parallel. And so we're starting up a program to do out-of-print works, but loaning them. Exactly what loaning means, I'm not quite sure. But anyway, loaning out-of-print works from the Boston Public Library, the Woods Hole Oceanographic Institute and a few other libraries that are starting to participate in this program, to try out this model of where does a library stop and where does the bookstore take over. So all in all, it's possible to do this in large scale. We're also going back over microfilm and getting that online. So, we can do 10 cents a page, we're going 15,000 books a month and we've got about 250,000 books online, counting all the other projects that are starting to add in. So what I wanted to argue is, books are within our grasp. The idea of taking on the whole ball of wax is not that big a deal. Yes, it costs tens of millions, low hundreds of millions, but one time shot and we've got basically the history of printed literature online. And then, there's business model issues about how to try to effectively market it and get it to people. But it is within our grasp, technologically and law-wise, at least for the out of print and out of copyright, we suggest, to be able to get the whole darn thing online.
总的来说,一切进展得很顺利 我们正开始从版权已过期的作品 扩展到绝版作品 我是这么看的:我们从版权已过期的图书馆藏书这一端出发 亚马逊则从仍在印行的图书那一端出发 我想我们会在中间某处会合 形成大家熟悉的那种传统格局 也就是出版系统和图书馆系统并行运作 所以我们正在启动一个处理绝版书的计划,不过是以出借的方式 出借具体意味着什么,我还不太确定 但无论如何,我们会出借来自波士顿公共图书馆、 伍兹霍尔海洋研究所以及其他几家图书馆的绝版书 这些图书馆正开始参与这个计划 目的是试验这样一种模式:图书馆的角色到哪里为止 书店又从哪里接手 总之,大规模地做这件事是可行的 我们也在重新处理微缩胶卷,并把它们放到网上 所以,我们可以做到每页10美分,每月扫描1万5千本书 已经有大约25万本书放到了网上 这包括了所有其他陆续加入的项目 我想论证的是,书籍已是我们力所能及的事 把整件事全部做下来并不是什么大不了的事 是的,要花费几千万到一两亿美元 但一次投入,我们基本上就能把印刷文献的历史放到网上 接下来还有一些商业模式的问题 也就是如何有效地推广,并把这些作品送到人们手中 不过,从技术和法律的角度来看,这是我们力所能及的 至少对绝版书和版权已过期的作品来说 我们认为可以把它们全部放到网上
Now let's go for audio, and I'm going to go through these. So how much is there? Well, as best we can tell, there are about two to three million disks having been published -- so 78s, long-playing records and CDs -- or at least that's the largest archives of published materials we've been able to sort of point at. It costs about 10 dollars a piece to go and take a disk and put it online, if you're doing things in volume. But we've found that the rights issues are really quite thorny. This is a fairly heavily litigated area, so we've found that there are niches in the music world that aren't served terribly well by the classic commercial publishing system. And we've been starting to make these available by going and offering shelf space on the Net. In the United States, it doesn't cost you to give something away. Right? If you give something to a charity or to the public, you get a pat on the back and a tax donation -- except on the Net, where you can go broke. If you put up a video of your garage band, and it starts getting heavily accessed, you can lose your guitars or your house.
接下来,我们来看音频,我会逐一说明 有多少音频呢? 据我们了解,已发行的唱片大约有两三百万张 包括78转唱片,黑胶唱片和光碟 至少这是我们所能找到的最大的已发行资料库的规模 如果大量进行,把一张唱片数位化并放到网上 大约需要10美元 但我们发现关于版权的问题是很棘手的 这个领域充斥着大量的诉讼 所以我们在音乐领域找到了一些利基 传统的商业出版系统并没有很好地服务这些领域 我们通过在网上提供货架空间 开始让这些音乐可以被取用 在美国,送出东西是不需要花费任何成本的,对吧? 如果你捐赠一些东西给慈善机构或者公众 你会得到别人的称赞,还可以抵税 网路世界例外,你可能会因此破产 如果你把自己车库乐队的影片放到网路上,而点阅量大增 你可能会因此失去你的吉他,或者是你的房子
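The talk gives a per-disk cost and a range for the number of published disks but does not multiply them out. As a rough sketch, assuming the 10-dollar figure holds across the whole range, the total lands somewhere between 20 and 30 million dollars. The function name `audio_cost_range_usd` is invented for this example, and the result is an illustration, not a figure from the talk.

```python
# Rough range for digitizing all published audio disks.
# Figures from the talk: two to three million disks, about 10 dollars each
# when done in volume. The totals below are derived, not quoted.

def audio_cost_range_usd(low_disks: int, high_disks: int,
                         usd_per_disk: int = 10) -> tuple[int, int]:
    """Return (low, high) cost estimates in US dollars."""
    return low_disks * usd_per_disk, high_disks * usd_per_disk


if __name__ == "__main__":
    low, high = audio_cost_range_usd(2_000_000, 3_000_000)
    print(f"${low:,} to ${high:,}")
```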
This doesn't make any sense. So we've offered unlimited storage, unlimited bandwidth, forever, for free, to anybody that has something to share that belongs in a library. And we've been getting a lot of takers. One is the rock 'n' rollers. The rock 'n' rollers had a tradition of sharing, as long as nobody made any money. You could -- concert recordings, it's not the commercial recordings, but concert recordings, started by the Grateful Dead. And we get about two or three bands a day signing up. They give permission, and we get about 40 or 50 concerts a day. We have about 40,000 concerts, everything the Grateful Dead ever did, up on the Net, so that people can see it and listen to this material. So audio is possible to put up, but the rights issues are really pretty thorny. We've got a lot of collections now -- a couple hundred thousand items -- and it's growing over time.
这是没有任何道理的 因此,我们永久免费提供无限制的存储空间和无限制的带宽 给任何想要分享适合图书馆收藏的内容的人 有很多人接受了这个提议。其中一类就是摇滚乐手 他们有着分享的传统 只要没有人从中获利 分享的是音乐会的录音,不是商业录音 而是演唱会的现场录音,这是由Grateful Dead乐队率先开始的 现在平均每天有两三个乐队注册加入 他们给予授权,这样平均每天我们就收到40到50场音乐会的录音 我们现在有大约4万场音乐会,包括Grateful Dead乐队的所有演出 都放到了网上,这样人们就可以观看和收听这些资料 因此,把音频放到网上是可行的,只是版权问题实在棘手 我们现在已经有了很多馆藏 有几十万个条目,并且还在不断增长
Moving images: if you think of theatrical releases, there are not that many of them. As best we can tell, there are about 150,000 to 200,000 movies ever that are really meant for a large-scale theatrical distribution. It's just not that many. But half of those were Indian. But anyway, it's doable, but we've only found about a thousand of these things that -- to be out of copyright. So we've digitized those and made those available. But we've found that there's lots of other types of movies that haven't really seen the light of day -- archival films. We've found, also, a lot of political films, a lot of amateur films, all sorts of things that are basically needing a home, a permanent home. So we've been starting to make these available and it's grown to be very popular. We're not quite a YouTube. We tended towards longer-term things and also things that people can reuse and make into new movies, which has just been great fun.
影片:如果你想到的商业电影 商业电影并不多 据我们所知最多也大约只有15万到20万部电影 他们是商业发行电影。只是数量并不多 其中的一半来自印度 无论如何,这是可行的 我们仅仅发现了其中的一千部左右 是不受版权保护的 我们将其数字化处理,并发布到网上 同时,我们发现了很多其他类型的电影 还有一些尚未发行的档案电影(纪录片) 我们找到了很多政治电影,业余电影 各类电影基本上需要一个家,一个永久的家 因此我们已开始让他们流通,受到了欢迎 我们不是在模仿YouTube 我们更偏向于有一定长度的影像资料 人们可以重新编辑,并可以添加到新电影中 这样做会非常有趣
Television comes quite a bit larger. We started recording 20 channels of television 24 hours a day. It's sort of the biggest TiVo box you've ever seen. It's about a petabyte, so far, of worldwide television -- Russian, Chinese, Japanese, Iraqi, Al Jazeera, BBC, CNN, ABC, CBS, NBC -- 24 hours a day. We only put one week up, which is mostly for cost reasons, which is the 9/11, sort of from 9/11/2001. For one week, what did the world see? CNN was saying that Palestinians were dancing in the streets. Were they? Let's look at the Palestinian television and find out. How can we have critical thinking without being able to quote and being able to compare what happened in the past? And television is dreadfully unrecorded and unquotable, except by Jon Stewart, who does a fabulous job. So anyway, television is, I would suggest, within our grasp. So 15 dollars per video hour, and also about 100 dollars to 150 dollars per celluloid hour, we're able to go and get materials online very inexpensively and have them up on the Net. And we've got, now, a lot of these materials. So we've got about 100,000 pieces up there. So books, music, video, software. There's only 50,000 titles of it. Mostly the issues there are legal issues and breaking copy protections. But we've worked through some of those, but we've still got real problems in Washington.
电视资讯的内容很多 我们已经开始了对20个频道一天24小时的记录 这可能是你见到的最大的TiVo盒(TiVo是一种数字录象设备,它能帮助人们非常方便地录下和筛选电视上播放过的节目) 这一切都将是PB级的(1PB=1024TB),因为有来自世界各个地区和国家的电视节目 里面有俄语,汉语,日语,伊拉克语,包括半岛电视台,BBC,CNN,ABC,CBS,NBC等等 全天24小时不停的记录 我们只是记录了一周的内容 主要还是成本的原因,当然也有911事件的因素 从2001/9/11开始:一周内,全世界如何看这件事? CNN电视台说,巴勒斯坦人在大街上跳舞 是这样吗?让我们查看巴勒斯坦播出的电视节目,从中寻找答案 不去引用和对比之前发生过的事情 我们如何才能有判断性的思考呢? 之前电视节目是很少被记录和引用的 直至JonStewart的出现,他的工作让我们受益匪浅 我认为电视内容我们掌控的很好 每小时的影片需要15美元,每小时的电影约需要100到150美元 我们可以低价收集到这些材料 并把它们发布到网上 我们现在已经收集了很多这样的资料 大约有近10万份左右 图书,音乐,影片,软体 - 只有五万个标题 存在的问题主要是法律问题,打破版权(防拷贝)保护 尽管,我们已经克服了一些问题 但我们在华盛顿仍然存在一些实际问题
Well, we're best known as the World Wide Web. We've been archiving the World Wide Web since 1996. We take a snapshot of every website and all of the pages on it, every two months. And actually, it's really been pioneered by Alexa Internet, which donates this collection to the Internet Archive. And it's been growing along for the last 11 years, and it's a fantastic resource. And we've made a Wayback Machine that you can then go and see old websites kind of the way they were. If you go and search on something -- this is Google.com, the different versions of it that we have, this is what it looks like when it was an alpha release, and this is what it looked like at Stanford. So anyway, you've got basically an idea of where things came from. Mostly, people want to see their old stuff out of this. If there's one thing that we want to learn from the Library of Alexandria version one, which is probably best known for burning, is, don't just have one copy. So we've started to -- we've made another copy of all of this and we actually put it back in the Library of Alexandria. So this is a picture of the Internet Archive at the Library of Alexandria. And we now have also another copy building up in Amsterdam. So, we should put it in the San Andreas Fault Line in San Francisco, flood zone in Amsterdam and in the Middle East. Right, so anyway ... so we're hedging our bets here. If we go and put it in a couple more places, I think we'll be in good shape.
我们最为人所知的是万维网的存档 自从1996年以来,我们就着手于万维网的归档 每两个月,我们就会采集每个网站及其所有网页的快照 事实上,做这件事情的先驱是Alexa Internet公司 Alexa Internet公司把他们收集的资料捐赠给了互联网档案馆 这些档案在过去的11年里一直在增加,已经是个了不起的资源 我们做了一个Wayback Machine(网站时光机) 你可以用它看到旧网站当时的样子 如果你去搜索某个网站,比如这是Google.com 这些是我们保存的各个不同版本 这是它还是alpha版时的样子 这是它在斯坦福大学时的样子 所以无论如何,你基本上可以知道这些东西是从哪里来的 大部分人是想借此看到自己以前的东西 如果说我们要从第一代亚历山大图书馆学到一个教训 它大概最为人所知的就是被烧毁 那就是不可只有一份副本 所以我们为这一切制作了另一个副本 然后我们把它放回了亚历山大图书馆 这是亚历山大图书馆中互联网档案馆的一张照片 并且我们在阿姆斯特丹也正在建立另一个副本 所以,我们该把它放在旧金山的圣安地列斯断层上 阿姆斯特丹的洪水区,还有中东。不管怎么说 我们是在分散风险 如果再多放几个地方,我想我们就比较稳妥了
There's a political and social question out of this. Is all of this, as we go digital, is it going to be public or private? There's some large companies that have seen this vision, that are doing large-scale digitization, but they're locking up the public domain. The question is, is that the world that we really want to live in? What's the role of the public versus the private as things go forward? How do we go and have a world where we both have libraries and publishing in the future, just as we basically benefited as we were growing up? So universal access to all knowledge -- I think it can be one of the greatest achievements of humankind, like the man on the moon, or the Gutenberg Bible, or the Library of Alexandria. It could be something that we're remembered for, for millennia, for having achieved. And as I said before, I'll end with something that's carved above the door of the Carnegie Library. Carnegie -- one of the great capitalists of this country -- carved above his legacy, "Free to the People." Thank you very much.
此时就会出现一个政治和社会问题 那就是:当我们把这一切数字化后,这些资料是为公众还是私人服务? 一些大的公司已经预见到了这个场景 他们在做大规模的数字化 但是他们已经开始锁定公共领域了 问题是:这是我们所期望生存的未来世界吗? 随着时代的发展,如何限定公共与私人的角色? 在未来如何使图书馆和出版社 协调发展,使得双方都能成长起来,并从中受益? 全部知识的普世近用 我认为这将成为人类最伟大的成就之一 就像人类登月,古腾堡圣经,亚历山大图书馆 这是值得我们怀念的 人类千年发展史中的一些事情 我之前说过,我会以刻在卡内基图书馆 门上的一句话来结束我的演讲 卡内基——这个国家的一个伟大的资本家—— 在他的遗产上刻着:“对所有人免费。” 谢谢。