Augmenting Training Data for Low-Resource Neural Machine Translation via Bilingual Word Embeddings and BERT Language Modelling.
Akshai Ramesh, Haque Usuf Uhana, Venkatesh Balavadhani Parthasarathy, Rejwanul Haque, Andy Way
Published in: IJCNN (2021)
Keyphrases
- machine translation
- language modelling
- n-gram
- word alignment
- word sense disambiguation
- statistical machine translation
- language model
- cross lingual
- word level
- machine translation system
- parallel corpus
- target language
- chinese english
- statistical translation models
- language independent
- english chinese
- source language
- cross language information retrieval
- information extraction
- language modeling
- natural language processing
- translation model
- parallel corpora
- query translation
- natural language
- bilingual lexicon
- bilingual dictionaries
- tf-idf
- text classification
- pseudo relevance feedback
- bag of words
- term frequency
- information retrieval
- co-occurrence
- relevance model
- knn
- document retrieval
- query expansion
- ad hoc retrieval
- word pairs
- language resources
- probabilistic model
- vector space model
- vector space