Discriminative Boosting from Dictionary and Raw Text - A Novel Approach to Build A Chinese Word Segmenter.
Fandong Meng, Wenbin Jiang, Hao Xiong, Qun Liu
Published in: COLING (Posters) (2012)
Keyphrases
- word segmentation
- chinese text
- english words
- text corpus
- english text
- chinese texts
- word level
- unknown words
- document analysis
- word recognition
- named entity recognizer
- chinese word segmentation
- n gram
- training corpus
- handwritten documents
- compound words
- language independent
- chinese english
- natural language text
- string matching
- sparse codes
- text classification
- information retrieval
- text retrieval
- related words
- feature selection
- word counts
- sparse representation
- text summarization
- noun phrases
- sparse coding
- linguistic information
- parallel corpus
- lexical information
- multiword
- keywords
- relation extraction
- sentence level
- discriminative classifiers
- broadcast news
- word pairs
- text input
- semi supervised
- word sense