Modern English language database hits 1 billion words
By: AGENCIES
According to the Associated Press and other media reports, Oxford University Press in the UK recently announced that the Oxford English Corpus has collected more than 1 billion words of English text, including 21st-century modern English such as internet English. The corpus is the underlying database used to compile the latest edition of the Oxford English Dictionary and reflects new trends in English usage.
A MASSIVE language research database responsible for bringing words such as "podcast" and "celebutante" to the pages of the Oxford dictionaries has officially hit a total of 1 billion words, researchers said.
Drawing on sources such as weblogs, chatrooms, newspapers, magazines and fiction, the Oxford English Corpus spots emerging trends in language usage to help guide lexicographers when composing the most recent editions of dictionaries.
The publisher of the Oxford English Dictionary, considered one of the most comprehensive dictionaries of the language, added words such as "supersize" and "wiki" to its pages in its most recent edition, published in August 2005.
Oxford University Press lexicographer Catherine Soanes said the database is not a collection of 1 billion different words, but of sentences and other examples of usage and spelling.
"The corpus is purely 21st century English," said Judy Pearsall, publishing manager of English dictionaries. "You're looking at current English and seeing what's happening right now. That's language at the cutting edge."
As hybrid words such as "geek-chic" or "inner-child" increase in usage, Pearsall said part of the research project's goal is to identify words that have lasting power.
"English gets really creative, really fun. What we're putting in dictionaries is words that will stick around," she said.
Launched in January 2000, the Oxford English Corpus is part of the world's largest-funded language research project, costing US$90,000-107,000 per year.
It has helped identify how the spellings of common phrases have changed, such as "fazed by" to "phased by" or "free rein" to "free reign." "Buck naked" has increasingly evolved to "butt naked."
The corpus collects evidence from all the places where English is spoken, whether North America, Britain, the Caribbean, Australia or India, to reflect the most current and common usage of the English language.
The Oxford English Corpus is at the heart of dictionary-making in Oxford in the 21st century and ensures that the very latest developments in the language can be tracked and recorded.
The corpus can be used in many different ways to study the English language and the cultures in which it is used. Because it is large, and because it is made up of text from many different subject areas and text types, it acts as a representative slice of contemporary English from which all aspects of written language, from vocabulary and lexis to punctuation, discourse, and register, can be studied.
Sponsor: China Daily. Copyright by 21st Century English Education Media (中报二十一世纪(北京)传媒科技有限公司). All rights reserved. Reproduction or mirroring without written authorization is prohibited.