Now, have any of y'all ever looked up this word? You know, in a dictionary? (Laughter) Yeah, that's what I thought. How about this word? Here, I'll show it to you. Lexicography: the practice of compiling dictionaries. Notice -- we're very specific -- that word "compile." The dictionary is not carved out of a piece of granite, out of a lump of rock. It's made up of lots of little bits. It's little discrete -- that's spelled D-I-S-C-R-E-T-E -- bits. And those bits are words.
Now one of the perks of being a lexicographer -- besides getting to come to TED -- is that you get to say really fun words, like lexicographical. Lexicographical has this great pattern: it's called a double dactyl. And just by saying double dactyl, I've sent the geek needle all the way into the red. (Laughter) (Applause) But "lexicographical" is the same pattern as "higgledy-piggledy." Right? It's a fun word to say, and I get to say it a lot. Now, one of the non-perks of being a lexicographer is that people don't usually have a kind of warm, fuzzy, snuggly image of the dictionary. Right? Nobody hugs their dictionaries. But what people really often think about the dictionary is, they think more like this. Just to let you know, I do not have a lexicographical whistle. But people think that my job is to let the good words make that difficult left-hand turn into the dictionary, and keep the bad words out.
But the thing is, I don't want to be a traffic cop. For one thing, I just do not do uniforms. And for another, deciding what words are good and what words are bad is actually not very easy. And it's not very fun. And when parts of your job are not easy or fun, you kind of look for an excuse not to do them. So if I had to think of some kind of occupation as a metaphor for my work, I would much rather be a fisherman. I want to throw my big net into the deep, blue ocean of English and see what marvelous creatures I can drag up from the bottom. But why do people want me to direct traffic, when I would much rather go fishing? Well, I blame the Queen. Why do I blame the Queen? Well, first of all, I blame the Queen because it's funny. But secondly, I blame the Queen because dictionaries have really not changed.
Our idea of what a dictionary is has not changed since her reign. The only thing that Queen Victoria would not be amused by in modern dictionaries is our inclusion of the F-word, which has happened in American dictionaries since 1965. So, there's this guy, right? Victorian era. James Murray, first editor of the Oxford English Dictionary. I do not have that hat. I wish I had that hat. So he's really responsible for a lot of what we consider modern in dictionaries today. When a guy who looks like that, in that hat, is the face of modernity, you have a problem. And so, James Murray could get a job on any dictionary today. There'd be virtually no learning curve.
And of course, a few of us are saying: okay, computers! Computers! What about computers? The thing about computers is, I love computers. I mean, I'm a huge geek, I love computers. I would go on a hunger strike before I let them take away Google Book Search from me. But computers don't do much else other than speed up the process of compiling dictionaries. They don't change the end result. Because what a dictionary is, is it's Victorian design merged with a little bit of modern propulsion. It's steampunk. What we have is an electric velocipede. You know, we have Victorian design with an engine on it. That's all! The design has not changed.
And OK, what about online dictionaries, right? Online dictionaries must be different. This is the Oxford English Dictionary Online, one of the best online dictionaries. This is my favorite word, by the way. Erinaceous: pertaining to the hedgehog family; of the nature of a hedgehog. Very useful word. So, look at that. Online dictionaries right now are paper thrown up on a screen. This is flat. Look how many links there are in the actual entry: two! Right? Those little buttons, I had them all expanded except for the date chart. So there's not very much going on here. There's not a lot of clickiness. And in fact, online dictionaries replicate almost all the problems of print, except for searchability. And when you improve searchability, you actually take away the one advantage of print, which is serendipity. Serendipity is when you find things you weren't looking for, because finding what you are looking for is so damned difficult.
So -- (Laughter) (Applause) -- now, when you think about this, what we have here is a ham butt problem. Does everyone know the ham butt problem? Woman's making a ham for a big family dinner. She goes to cut the butt off the ham and throw it away, and she looks at this piece of ham and she's like, "This is a perfectly good piece of ham. Why am I throwing this away?" She thinks, "Well, my mom always did this." So she calls up mom, and she says, "Mom, why'd you cut the butt off the ham, when you're making a ham?" Mom says, "I don't know, my mom always did it!" So they call grandma, and grandma says, "My pan was too small!" (Laughter)
So, it's not that we have good words and bad words. We have a pan that's too small! You know, that ham butt is delicious! There's no reason to throw it away. The bad words -- see, when people think about a place and they don't find a place on the map, they think, "This map sucks!" When they find a nightspot or a bar, and it's not in the guidebook, they're like, "Ooh, this place must be cool! It's not in the guidebook." When they find a word that's not in the dictionary, they think, "This must be a bad word." Why? It's more likely to be a bad dictionary. Why are you blaming the ham for being too big for the pan? So, you can't get a smaller ham. The English language is as big as it is.
So, if you have a ham butt problem, and you're thinking about the ham butt problem, the conclusion that it leads you to is inexorable and counterintuitive: paper is the enemy of words. How can this be? I mean, I love books. I really love books. Some of my best friends are books. But the book is not the best shape for the dictionary. Now they're going to think "Oh, boy. People are going to take away my beautiful, paper dictionaries?" No. There will still be paper dictionaries. When we had cars -- when cars became the dominant mode of transportation, we didn't round up all the horses and shoot them. You know, there're still going to be paper dictionaries, but it's not going to be the dominant dictionary. The book-shaped dictionary is not going to be the only shape dictionaries come in. And it's not going to be the prototype for the shapes dictionaries come in.
So, think about it this way: if you've got an artificial constraint, artificial constraints lead to arbitrary distinctions and a skewed worldview. What if biologists could only study animals that made people go, "Aww." Right? What if we made aesthetic judgments about animals, and only the ones we thought were cute were the ones that we could study? We'd know a whole lot about charismatic megafauna, and not very much about much else. And I think this is a problem. I think we should study all the words, because when you think about words, you can make beautiful expressions from very humble parts. Lexicography is really more about material science. We are studying the tolerances of the materials that you use to build the structure of your expression: your speeches and your writing. And then, often people say to me, "Well, OK, how do I know that this word is real?" They think, "OK, if we think words are the tools that we use to build the expressions of our thoughts, how can you say that screwdrivers are better than hammers? How can you say that a sledgehammer is better than a ball-peen hammer?" They're just the right tools for the job.
And so people say to me, "How do I know if a word is real?" You know, anybody who's read a children's book knows that love makes things real. If you love a word, use it. That makes it real. Being in the dictionary is an artificial distinction. It doesn't make a word any more real than any other way. If you love a word, it becomes real. So if we're not worrying about directing traffic, if we've transcended paper, if we are worrying less about control and more about description, then we can think of the English language as being this beautiful mobile. And any time one of those little parts of the mobile changes, is touched, any time you touch a word, you use it in a new context, you give it a new connotation, you verb it, you make the mobile move. You didn't break it. It's just in a new position, and that new position can be just as beautiful.
Now, if you're no longer a traffic cop -- the problem with being a traffic cop is there can only be so many traffic cops in any one intersection, or the cars get confused. Right? But if your goal is no longer to direct the traffic, but maybe to count the cars that go by, then more eyeballs are better. You can ask for help! If you ask for help, you get more done. And we really need help. Library of Congress: 17 million books, of which half are in English. If only one out of every 10 of those books had a word that's not in the dictionary in it, that would be equivalent to more than two unabridged dictionaries.
And I find an un-dictionaried word -- a word like "un-dictionaried," for example -- in almost every book I read. What about newspapers? Newspaper archive goes back to 1759, 58.1 million newspaper pages. If only one in 100 of those pages had an un-dictionaried word on it, it would be an entire other OED. That's 500,000 more words. So that's a lot. And I'm not even talking about magazines. I'm not talking about blogs -- and I find more new words on BoingBoing in a given week than I do in Newsweek or Time. There's a lot going on there.
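As a rough check on those two estimates, here is a minimal Python sketch of the arithmetic. It uses only the figures quoted in the talk, plus one assumption the talk does not state: that every qualifying book or page contributes exactly one distinct new word, with no duplicates. The function and variable names are illustrative only.

```python
# Back-of-envelope check of the figures quoted in the talk.
# Assumption (not stated in the talk): each qualifying book or page
# contributes exactly one distinct new word, with no duplicates.

LOC_BOOKS = 17_000_000        # Library of Congress books (from the talk)
NEWSPAPER_PAGES = 58_100_000  # archived newspaper pages (from the talk)
OED_WORDS = 500_000           # size the talk gives for "an entire other OED"


def new_words(items: int, one_in: int) -> int:
    """Words found if one item in every `one_in` holds one new word."""
    return items // one_in


english_books = LOC_BOOKS // 2                # "half are in English"
from_books = new_words(english_books, 10)     # one book in every 10
from_pages = new_words(NEWSPAPER_PAGES, 100)  # one page in every 100

# The talk does not say how big an unabridged dictionary is. This is the
# size below which "more than two unabridged dictionaries" holds.
implied_unabridged_max = from_books // 2

print(f"new words from books: {from_books:,}")            # 850,000
print(f"new words from newspaper pages: {from_pages:,}")  # 581,000
print(f"unabridged dictionary must be under {implied_unabridged_max:,} words")  # 425,000
print(f"newspaper estimate vs. OED figure: {from_pages / OED_WORDS:.2f}x")      # 1.16x
```

Under that assumption, the books alone give about 850,000 words and the newspaper pages about 581,000, which is a little more than the 500,000 quoted for the OED.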
And I'm not even talking about polysemy, which is the greedy habit some words have of taking more than one meaning for themselves. So if you think of the word "set," a set can be a badger's burrow, a set can be one of the pleats in an Elizabethan ruff, and there's one numbered definition in the OED. The OED has 33 different numbered definitions for set. Tiny, little word, 33 numbered definitions. One of them is just labeled "miscellaneous technical senses." Do you know what that says to me? That says to me, it was Friday afternoon and somebody wanted to go down the pub. (Laughter) That's a lexicographical cop-out, to say, "miscellaneous technical senses."
So, we have all these words, and we really need help! And the thing is, we could ask for help -- asking for help's not that hard. I mean, lexicography is not rocket science. See, I just gave you a lot of words and a lot of numbers, and this is more of a visual explanation. If we think of the dictionary as being the map of the English language, these bright spots are what we know about, and the dark spots are where we are in the dark. If that was the map of all the words in American English, we don't know very much. And we don't even know the shape of the language. If this was the dictionary -- if this was the map of American English -- look, we have a kind of lumpy idea of Florida, but there's no California! We're missing California from American English. We just don't know enough, and we don't even know that we're missing California. We don't even see that there's a gap on the map.
So again, lexicography is not rocket science. But even if it were, rocket science is being done by dedicated amateurs these days. You know? It can't be that hard to find some words! So, enough scientists in other disciplines are really asking people to help, and they're doing a good job of it. For instance, there's eBird, where amateur birdwatchers can upload information about their bird sightings. And then, ornithologists can go and help track populations, migrations, etc.
And there's this guy, Mike Oates. Mike Oates lives in the U.K. He's a director of an electroplating company. He's found more than 140 comets. He's found so many comets, they named a comet after him. It's kind of out past Mars. It's a hike. I don't think he's getting his picture taken there anytime soon. But he found 140 comets without a telescope. He downloaded data from the NASA SOHO satellite, and that's how he found them. If we can find comets without a telescope, shouldn't we be able to find words?
Now, y'all know where I'm going with this. Because I'm going to the Internet, which is where everybody goes. And the Internet is great for collecting words, because the Internet's full of collectors. And this is a little-known technological fact about the Internet, but the Internet is actually made up of words and enthusiasm. And words and enthusiasm actually happen to be the recipe for lexicography. Isn't that great? So there are a lot of really good word-collecting sites out there right now, but the problem with some of them is that they're not scientific enough. They show the word, but they don't show any context. Where did it come from? Who said it? What newspaper was it in? What book?
Because a word is like an archaeological artifact. If you don't know the provenance or the source of the artifact, it's not science, it's a pretty thing to look at. So a word without its source is like a cut flower. You know, it's pretty to look at for a while, but then it dies. It dies too fast. So, this whole time I've been saying, "The dictionary, the dictionary, the dictionary, the dictionary." Not "a dictionary," or "dictionaries." And that's because, well, people use the dictionary to stand for the whole language. They use it synecdochically. And one of the problems of knowing a word like "synecdochically" is that you really want an excuse to say "synecdochically." This whole talk has just been an excuse to get me to the point where I could say "synecdochically" to all of you. So I'm really sorry. But when you use a part of something -- like the dictionary is a part of the language, or a flag stands for the United States, it's a symbol of the country -- then you're using it synecdochically. But the thing is, we could make the dictionary the whole language. If we get a bigger pan, then we can put all the words in. We can put in all the meanings. Doesn't everyone want more meaning in their lives? And we can make the dictionary not just be a symbol of the language -- we can make it be the whole language.
You see, what I'm really hoping for is that my son, who turns seven this month -- I want him to barely remember that this is the form factor that dictionaries used to come in. This is what dictionaries used to look like. I want him to think of this kind of dictionary as an eight-track tape. It's a format that died because it wasn't useful enough. It wasn't really what people needed. And the thing is, if we can put in all the words, no longer have that artificial distinction between good and bad, we can really describe the language like scientists. We can leave the aesthetic judgments to the writers and the speakers. If we can do that, then I can spend all my time fishing, and I don't have to be a traffic cop anymore. Thank you very much for your kind attention.