CoLI-Machine Learning Approaches for Code-mixed Language Identification at the Word Level in Kannada-English Texts.
H. L. Shashirekha, Fazlourrahman Balouchzahi, M. D. Anusha, Grigori Sidorov
Published in: CoRR (2022)
Keyphrases
- machine learning approaches
- language identification
- word level
- document images
- indian languages
- optical character recognition
- english text
- machine learning methods
- text lines
- document analysis
- machine learning
- character recognition
- machine learning algorithms
- language independent
- keywords
- sentence level
- data mining methods
- word segmentation
- machine vision
- text retrieval
- feature vectors