Generating a Category Set of Words Using a Hierarchical Part-of-Speech System and Tagged Corpus.
Takeyuki Kojima, Yoshiyuki Kotani
Published in: PACLIC (2002)
Keyphrases
- training corpus
- part of speech
- n-gram
- multiword
- POS tagging
- linguistic information
- syntactic categories
- noun phrases
- linguistic features
- lexical information
- syntactic features
- natural language processing
- unknown words
- natural language text
- syntactic information
- semantic roles
- word sense disambiguation
- feature set
- Chinese word segmentation
- POS taggers
- machine learning