Towards Bengali Word Embedding: Corpus Creation, Intrinsic and Extrinsic Evaluations.
Md. Rajib Hossain, Mohammed Moshiul Hoque
Published in: ICON (2020)
Keyphrases
- statistical machine translation
- news corpus
- training corpus
- word frequencies
- text corpus
- machine translation system
- word alignment
- news articles
- natural language text
- word pairs
- named entities
- parallel corpus
- sentence level
- multiword
- translation model
- English words
- linguistic information
- machine translation
- word sense
- noun phrases
- unknown words
- co-occurrence
- sentence pairs
- spontaneous speech
- lexical features
- text classification
- automatic summarization
- Indian languages
- cross-lingual
- named entity recognition
- vector space
- text corpora
- target language
- writing style
- word sense disambiguation
- stop words
- cross-language information retrieval
- ambiguous words