Studying the Effect of Input Size for Bayesian Word Segmentation on the Providence Corpus.
Benjamin Börschinger, Katherine Demuth, Mark Johnson
Published in: COLING (2012)
Keyphrases
- word segmentation
- POS tagging
- unknown words
- word recognition
- language independent
- Bayesian networks
- Chinese text
- n-gram
- Chinese word segmentation
- Chinese text retrieval
- language modeling
- text classification
- part of speech
- dependency parsing
- language model
- information retrieval systems
- text mining
- high dimensional
- search engine