I. Access
https://www.english-corpora.org/
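Access is available from campus IP addresses. Register a personal account and log in to use the platform. There is no limit on concurrent users.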
II. Access Period
2025/01/06-2026/01/06
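III. About the Database
English-Corpora.org is one of the most widely used collections of corpora. The platform brings together a number of commonly used corpora, including: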
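1) Corpus of Contemporary American English (COCA): an earlier release contained about 450 million words in 190,000 texts covering 1990-2012, divided into five genres: spoken, fiction, magazines, newspapers, and academic texts. The table below lists the current release at 1.0 billion words covering 1990-2019.
2) Corpus of Historical American English (COHA): an earlier release contained about 400 million words in 115,000 texts covering 1810-2009. The table below lists the current release at 475 million words covering 1820-2019.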
3) Corpus of Global Web-Based English (GloWbE): contains 1.9 billion words from more than 1.8 million web pages in 20 English-speaking countries.
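4) British National Corpus (BNC): contains 100 million words from the 1980s to 1993. The BNC was originally created by Oxford University Press in the 1980s and early 1990s. English-Corpora.org provides the complete BNC text, tagged with the CLAWS 7 tagset.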
All of the corpora listed below can be searched on the platform, and detailed information is available for each entry.
Corpus (online access) | # words | Dialect | Time period | Genre(s) |
News on the Web (NOW) | 14.2 billion+ | 20 countries | 2010-yesterday | Web: News |
iWeb: The Intelligent Web-based Corpus | 14 billion | 6 countries | 2017 | Web |
Global Web-Based English (GloWbE) | 1.9 billion | 20 countries | 2012-13 | Web (incl blogs) |
Wikipedia Corpus | 1.9 billion | (Various) | 2014 | Wikipedia |
Coronavirus Corpus | 1.3 billion+ | 20 countries | Jan 2020-yesterday | Web: News |
Corpus of Contemporary American English (COCA) | 1.0 billion | American | 1990-2019 | Balanced |
Corpus of Historical American English (COHA) | 475 million | American | 1820-2019 | Balanced |
The TV Corpus | 325 million | 6 countries | 1950-2018 | TV shows |
The Movie Corpus | 200 million | 6 countries | 1930-2018 | Movies |
Corpus of American Soap Operas | 100 million | American | 2001-2012 | TV shows |
Hansard Corpus | 1.6 billion | British | 1803-2005 | Parliament |
Early English Books Online | 755 million | British | 1470s-1690s | (Various) |
Corpus of US Supreme Court Opinions | 130 million | American | 1790s-present | Legal opinions |
TIME Magazine Corpus | 100 million | American | 1923-2006 | Magazine |
British National Corpus (BNC) * | 100 million | British | 1980s-1993 | Balanced |
Strathy Corpus (Canada) | 50 million | Canadian | 1970s-2000s | Balanced |
CORE Corpus | 50 million | 6 countries | 2014 | Web |
Google Books: American English | 155 billion | American | 1500s-2000s | (Various) |
Google Books: British English | 34 billion | British | 1500s-2000 | (Various) |