Script Normalization for Unconventional Writing of Under-Resourced Languages in Bilingual Communities.
Sina Ahmadi, Antonios Anastasopoulos
Published in: CoRR (2023)
Keyphrases
- indian languages
- cross lingual
- language identification
- query translation
- parallel corpora
- comparable corpora
- machine translation
- language resources
- target language
- statistical machine translation
- language independent
- document images
- cross lingual information retrieval
- cross language
- sentence pairs
- cross language information retrieval
- machine translation system
- source language
- bilingual dictionaries
- expressive power
- normalization method
- learning community
- word alignment
- community detection
- news articles
- machine readable dictionaries
- databases
- bilingual lexicon
- social networks
- virtual communities
- complex networks
- social network analysis
- text classification
- web communities
- grammatical inference
- sentence level
- collaborative writing
- text summarization
- community structure
- character n grams