Mining Training Data for Language Modeling Across the World's Languages.
Manasa Prasad, Theresa Breiner, Daan van Esch
Published in: SLTU (2018)
Keyphrases
- language modeling
- cross-lingual
- training data
- language model
- expert finding
- comparable corpora
- retrieval model
- information retrieval
- query expansion
- language independent
- probabilistic model
- n-gram
- text classification
- web mining
- knowledge discovery
- cross-language
- statistical language models
- classification accuracy
- statistical machine translation
- decision trees
- relevance model
- learning algorithm
- data sets
- translation model
- data mining
- retrieval effectiveness
- query terms
- text mining
- target language
- user queries
- parallel corpora
- frequent patterns
- improvements in retrieval effectiveness