Automatic clustering of part-of-speech for vocabulary divided PLSA language model.
Motoyuki Suzuki, Naoto Kuriyama, Akinori Ito, Shozo Makino
Published in: NLPKE (2008)
Keyphrases
- n-gram
- language model
- part-of-speech
- probabilistic model
- language modeling
- out-of-vocabulary
- document retrieval
- context-sensitive
- retrieval model
- test collection
- multiword
- clustering method
- linguistic features
- bag-of-words
- query expansion
- information retrieval
- clustering algorithm
- query terms
- text classification
- vector space model
- translation model
- k-means
- unsupervised learning
- word sense disambiguation
- document clustering
- tf-idf
- cluster analysis
- word segmentation
- natural language processing
- visual features
- semi-supervised
- machine learning
- co-occurrence
- statistical machine translation