A New Corpus for Low-Resourced Sindhi Language with Word Embeddings.
Wazir Ali, Jay Kumar, Junyu Lu, Zenglin Xu
Published in: CoRR (2019)
Keyphrases
- parallel corpus
- word frequencies
- linguistic knowledge
- word pairs
- english words
- machine translation system
- probabilistic context free grammars
- text corpus
- target language
- statistical machine translation
- spanish language
- english text
- language specific
- cross lingual
- programming language
- lexical features
- unknown words
- co occurrence
- lexical information
- language independent
- sentence level
- word sense
- noun phrases
- language processing
- sentence pairs
- natural language
- natural language text
- language learning
- n gram
- word frequency
- computational linguistics
- machine translation
- dimensionality reduction
- source language
- word order
- word meanings
- information retrieval
- chinese text retrieval