Synthetic Cross-language Information Retrieval Training Data.
James Mayfield, Eugene Yang, Dawn J. Lawrie, Samuel Barham, Orion Weller, Marc Mason, Suraj Nair, Scott Miller
Published in: CoRR (2023)
Keyphrases
- question answering
- cross language information retrieval
- cross language
- training data
- query translation
- natural language processing
- information retrieval
- training set
- parallel corpora
- machine translation
- english chinese
- translation model
- terminology extraction
- language resources
- learning algorithm
- decision trees
- out of vocabulary
- comparable corpora
- supervised learning
- multilingual information retrieval
- multilingual information access
- parallel texts
- machine transliteration
- labeled data
- query terms
- semi supervised learning
- statistical machine translation
- bilingual dictionaries
- parallel corpus
- artificial intelligence
- bilingual lexicon