Wordmaster No. 241: Word Statistics

Source: 可可英语 | Editor: Jasmine

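Today's Wordmaster looks at word statistics: if every word in English were lined up from the most commonly used to the least commonly used, what would that look like? ...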

Broadcast on COAST TO COAST: September 2, 2004

AA: I'm Avi Arditti with Rosanne Skirble, and this week on Wordmaster: counting words.

RS: If you wanted to show people the 88,000 most common words in English, how would you do it? Jonathan Harris thought of a sentence — or something that looks like one. He works on interactive art projects. He laid out the words in a straight line, from the most frequently used to the least frequently used.

AA: This is all on a Web site, so you keep clicking to the right to read the words on the screen. Or you can look up specific words to see their ranking. There's also a visual trick that displays the words as a graph. The most common are in really big type; the least common are in really small type.

RS: Jonathan Harris is an artist in the field of "information visualization." What he created is wordcount-dot-org.

JONATHAN HARRIS: "The experience I was trying to create for the user was like an archeologist sort of sifting through sand. And you never really get a look at the whole language at any one time. You really have to zero in on one specific part and explore there. And in this sense you can really spend hours just killing time on this and playing around."

RS: "You say it's like one very long sentence, but is there anything connecting these words?"

JONATHAN HARRIS: "That's what's really interesting, and this is the one aspect of WordCount that people have really gravitated toward, as I've found. Because the data is essentially random — I mean, it's not random, but the fact that a given word is next to another word is only based on how often those words appear in normal English usage. But when you have 88,000 words placed back to back, chances are pretty good that a few of those sequences are going to form some pretty conspiratorial meanings.

"Every morning I sort of come into work and I check my e-mail and I have a pile of e-mails waiting for me from people all around the globe that have found interesting sequences in WordCount. Some of my favorites are words 992 to 995 are 'American ensure oil opportunity.' Then 4304 to 4307 is 'Microsoft acquire salary tremendous.'"

AA: "I like this one, 5283 to 5285, which is 'angel seeks supper.'"

JONATHAN HARRIS: "Exactly. I found that a lot of people suggest that this be used as a good device for people trying to come up with a name for their band."

RS: "How is it determined, the frequency of any given word?"

JONATHAN HARRIS: "The frequency is data that is not generated by me. The frequency data was all coming from this source data that I used, which is the British National Corpus and that's a collection of written and spoken English words that were collected over a few years, I think back in the mid-1990s, by this group in England. It's a little bit dated; I've found one word that people are often surprised does not appear at all in the archive is blog. So clearly the phenomenon of Web logging came up after this data was collected."
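
The ranking Harris describes comes down to counting how often each word occurs in a corpus and sorting by that count. Below is a minimal Python sketch of that idea. It assumes a tiny made-up corpus in place of the British National Corpus, and the function names `rank_words` and `rank_of` are hypothetical; nothing here is taken from WordCount itself.

```python
import re
from collections import Counter


def rank_words(text):
    """Return the distinct words of `text`, most frequent first.

    Ties are broken alphabetically so the order is deterministic.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return sorted(counts, key=lambda w: (-counts[w], w))


def rank_of(word, ranking):
    """Return the 1-based rank of `word`, or None if it never occurs."""
    try:
        return ranking.index(word.lower()) + 1
    except ValueError:
        return None


# Toy corpus (an assumption for illustration only).
toy_corpus = "the cat saw the dog and the dog saw the bird"
ranking = rank_words(toy_corpus)

print(ranking)                   # words laid out like one long "sentence"
print(rank_of("the", ranking))   # rank of the most frequent word
print(rank_of("blog", ranking))  # a word missing from the corpus has no rank
```

A word that never appears in the source text, as with "blog" in the interview, simply has no rank, which is why the sketch returns `None` for it.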

AA: "So now you describe this basically as an 88,000-word-long sentence, starting with the word 'the,' the most frequently used word in the English language. What's at the other end?"

JONATHAN HARRIS: "The other end is surprising, and this is a big point of contention for a lot of people that actually find what the last word is. But the last word, surprisingly or not, is conquistador. And if you look through the list and you spend some time with it, you'll find that there are many words much, much further in front of conquistador that you've never even heard of. So clearly there seems to be some errata in their data."

AA: "So conquistador, as in a Spanish conqueror?"

JONATHAN HARRIS: "Some other interesting sort of comparative rankings: war is 304 and peace is 1,155. Love beats hate, Coke beats Pepsi and love beats sex by over 1,000."

AA: "Now this is according to British usage from a few years ago, right?"

JONATHAN HARRIS: "That's right, so maybe this has all changed since then. WordCount went online about five months ago, and almost nobody saw it for about four months. And then back at the beginning of July a friend of mine posted it on his blog and within about a day or two days, the site was getting about 20,000 unique visitors a day.

"And I was getting e-mails from all over the world, mainly people taking issue with some of the apparent disparities in the data, how some seemingly obscure words were being placed ahead of seemingly more common ones, but other people that were just sort of touched by how fun it was. And people, you know, found these little comparisons entertaining, like the Coke and Pepsi, and the love and the hate, and the war and the peace. Things like this."

RS: Jonathan Harris, talking to us from Fabrica, a creative think tank for young artists where he has a year-long fellowship. It's located near Venice, Italy, and it's where he developed wordcount dot o-r-g.

AA: And that's all for this week. Our e-mail address is word@voanews.com. And our Web site is voanews.com/wordmaster. With Rosanne Skirble, I'm Avi Arditti.

Key vocabulary

describe [dis'kraib]
vt. to describe; to draw (especially a geometric figure); to characterize as

obscure [əb'skjuə]
adj. dim; hard to understand; little known; (phonetics) weakly stressed

unique [ju:'ni:k]
adj. one of a kind; distinctive; rare

graph [grɑ:f]
n. chart, diagram
vt. to represent with a chart

determined [di'tə:mind]
adj. resolute, firmly decided

source [sɔ:s]
n. place of origin; source; source material

entertaining [entə'teiniŋ]
adj. amusing, entertaining, enjoyable
n. hospitality

frequency ['fri:kwənsi]
n. frequent occurrence; frequency

fellowship ['feləuʃip]
n. friendship; association; membership; scholarship

apparent [ə'pærənt]
adj. obvious; seeming

