HindEnCorp - Hindi-English and Hindi-only Corpus for Machine Translation.
Ondrej Bojar, Vojtech Diatka, Pavel Rychlý, Pavel Stranák, Vit Suchomel, Ales Tamchyna, Daniel Zeman
Published in: LREC (2014)
Keyphrases
- machine translation
- statistical machine translation
- cross lingual
- parallel corpora
- chinese english
- comparable corpora
- natural language processing
- target language
- parallel corpus
- machine translation system
- cross language information retrieval
- language independent
- word alignment
- source language
- brazilian portuguese
- phrase based smt
- information extraction
- language processing
- natural language
- training corpus
- natural language generation
- pos tagging
- mono lingual
- language resources
- machine learning
- query translation
- word level
- text mining
- indian languages