N-gram Weighting: Reducing Training Data Mismatch in Cross-Domain Language Model Estimation.
Bo-June Paul Hsu, James R. Glass
Published in: EMNLP (2008)
Keyphrases
- n-gram
- cross domain
- language model
- training data
- target domain
- language modeling
- language modelling
- document retrieval
- sentiment classification
- probabilistic model
- knowledge transfer
- bag of words
- information retrieval
- transfer learning
- retrieval model
- text categorization
- test data
- vector space model
- learning algorithm
- pseudo relevance feedback
- word segmentation
- decision trees
- query expansion
- test collection
- training set
- supervised learning
- query terms
- tf-idf
- prior knowledge
- text retrieval
- text classification
- similarity measure
- part of speech
- relevance model
- active learning
- statistical language modeling
- document ranking
- weighting scheme
- reinforcement learning
- support vector machine
- labeled data
- unlabeled data
- naive bayes