Discrimination between Arabic and Latin from bilingual documents
Sofiene Haboubi, Samia Maddouri, Hamid Amiri
Published in: CoRR (2012)
Keyphrases
- Arabic documents
- handwritten documents
- word spotting
- Arabic language
- parallel corpora
- character n-grams
- information retrieval
- optical character recognition
- printed documents
- language identification
- document collections
- document classification
- multiword
- cross-language
- Indian languages
- document level
- document analysis
- feature selection
- machine translation
- document clustering
- text retrieval
- metadata
- document images
- XML documents
- web documents
- information retrieval systems
- bilingual lexicon
- word forms
- parallel corpus
- n-gram
- retrieval systems
- text documents
- document retrieval
- language model
- natural language
- Arabic text