Detecting "protein words" through unsupervised word segmentation.
Liang Wang, Kaiyong Zhao
Published in: F1000Research (2015)
Keyphrases
- word segmentation
- POS tagging
- Chinese text retrieval
- n-gram
- word recognition
- Chinese text
- Chinese word segmentation
- unknown words
- language independent
- handwriting recognition
- statistical language modeling
- text classification
- supervised learning
- word level
- semi-supervised
- language modeling
- retrieval systems
- sparse data
- co-occurrence
- Bayesian networks