Bayesian Unsupervised Word Segmentation with Nested Pitman-Yor Language Modeling.
Daichi Mochihashi, Takeshi Yamada, Naonori Ueda
Published in: ACL/IJCNLP (2009)
Keyphrases
- word segmentation
- language modeling
- POS tagging
- language model
- n-gram
- dirichlet prior
- chinese text retrieval
- cross-lingual
- retrieval model
- information retrieval
- probabilistic model
- query expansion
- handwriting recognition
- bayesian networks
- unsupervised learning
- supervised learning
- text classification
- word level
- language independent
- machine learning
- statistical language modeling
- learning algorithm
- part of speech
- document retrieval
- semi-supervised
- tf-idf
- feature selection
- test collection
- digital libraries
- text categorization
- feature space