Hua Yu: A Word-segmented and Part-Of-Speech Tagged Chinese Corpus.
Maosong Sun, Honglin Sun, Changning Huang, Zhang Pu, Xing Hongbing, Zhou Qiang
Published in: LREC (2000)
Keyphrases
- part of speech
- unknown words
- pos tagging
- chinese word segmentation
- word segmentation
- n gram
- training corpus
- noun phrases
- natural language processing
- tree bank
- linguistic features
- multiword
- word sense disambiguation
- linguistic information
- lexical information
- pos taggers
- syntactic information
- word sense
- syntactic features
- syntactic categories
- penn treebank
- parse tree
- morphological analysis
- tf idf
- dependency parser
- text documents
- ambiguous words
- dependency parsing
- text mining
- topic tracking
- information extraction
- document analysis