JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus.
Makoto Morishita, Jun Suzuki, Masaaki Nagata
Published in: LREC (2020)
Keyphrases
- parallel corpus
- cross lingual
- machine translation
- cross language information retrieval
- language independent
- machine translation system
- query translation
- word alignment
- target language
- statistical machine translation
- cross language
- sentence pairs
- source language
- language modeling
- multimedia
- parallel corpora
- information retrieval
- latent semantic analysis
- relevance feedback
- machine learning