Empirical Use of Information Retrieval to Build Synthetic Data for SMT Domain Adaptation.
Sadaf Abdul-Rauf, Holger Schwenk, Patrik Lambert, Mohammad Nawaz
Published in: IEEE ACM Trans. Audio Speech Lang. Process. (2016)
Keyphrases
- synthetic data
- domain adaptation
- information retrieval
- cross-domain
- labeled data
- multiple sources
- real world
- sentiment classification
- semi-supervised
- data sets
- semi-supervised learning
- language modeling
- machine learning
- information retrieval systems
- test data
- target domain
- search engine
- unlabeled data
- document classification
- test collection
- text mining
- query expansion
- question answering
- covariate shift
- feature selection
- test cases
- information extraction
- similarity measure
- document collections
- active learning
- language model
- domain-specific
- knowledge discovery