cEnTam: Creation and Validation of a New English-Tamil Bilingual Corpus.
Sanjanasri J. P., B. Premjith, Vijay Krishna Menon, K. P. Soman
Published in: BUCC@LREC (2020)
Keyphrases
- parallel corpus
- sentence pairs
- indian languages
- statistical machine translation
- cross lingual
- parallel corpora
- machine translation
- chinese english
- machine translation system
- comparable corpora
- multiword
- cross language information retrieval
- word alignment
- query translation
- language independent
- target language
- english chinese
- word sense
- language identification
- cross language
- handwritten characters
- link grammar
- lexical knowledge
- bilingual dictionaries
- parallel texts
- word pairs
- language modeling
- document images
- handwritten character recognition
- source language
- english words
- training corpus
- language resources
- text classification
- bilingual lexicon
- proper names
- out of vocabulary
- linguistic resources
- text corpora
- wordnet
- person names
- wide coverage
- co occurrence
- natural language processing
- information extraction
- open domain
- translation model
- spoken language
- word level