Document-aligned Japanese-English Conversation Parallel Corpus.
Matiss Rikters, Ryokan Ri, Tong Li, Toshiaki Nakazawa
Published in: CoRR (2020)
Keyphrases
- parallel corpus
- cross lingual
- source language
- document clustering
- machine translation
- query translation
- cross language information retrieval
- target language
- word alignment
- language independent
- machine translation system
- latent semantic analysis
- semantic space
- sentence pairs
- document images
- natural language
- information retrieval
- document collections
- statistical machine translation
- vector space model
- document classification
- text documents
- document representation
- document retrieval
- web documents
- information retrieval systems
- query terms
- retrieval systems
- word level
- tf idf
- relevant documents
- bilingual dictionaries
- search engine