Title | 词性赋码语料库的检索与正则表达式的编写 |
---|---|
Parallel title | Querying part-of-speech tagged corpora and the compilation of regular expressions |
Author | 梁茂成 (Liang Maocheng) |
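Chinese abstract (translated) | Annotation can bring added value to a corpus (Leech 1997). This idea has gradually become a consensus in corpus linguistics, and annotation has accordingly become one of the most basic norms for large corpora. In foreign language teaching and research, powerful regular expressions can often be used to query part-of-speech tagged corpora and extract whatever information is needed. However, because the symbols used in regular expressions differ from the words of natural language, they are difficult for the vast majority of people engaged in language teaching, language learning, and language research. And because querying is one of the most important steps in working with corpora, using regular expressions effectively for corpus queries has become one of the hard problems in corpus teaching and research. |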
English abstract | Corpus annotation can bring “added value” to corpora, and annotation has become standard practice for most large-scale corpora. In foreign language teaching and research, practitioners often need to query part-of-speech tagged corpora with regular expressions (regex) they write themselves in order to retrieve the information they need. However, because regular expressions are complex and rely on symbols unlike those of natural language, regex-based corpus querying has not been easy for most language teachers and researchers. Since querying is central to almost all corpus-related work, making effective use of regular expressions remains a persistent difficulty for them. |
Pages | 65-81 |
Keywords | corpus, annotation, querying, regular expressions |
Journal | 中國外語教育 (Foreign Language Education in China) |
Issue | May 2009 (Vol. 2, No. 2) |
Publisher | 外語教學與研究出版社 (Foreign Language Teaching and Research Press) |