Language identification of on-line documents using word shapes.
Nicola Nobile, Sabine Bergler, Ching Y. Suen, Sami Khoury
Published in: ICDAR (1997)
Keyphrases
- language identification
- indian languages
- english text
- text lines
- document images
- information retrieval
- document collections
- speaker identification
- keywords
- document analysis
- cross lingual
- document clustering
- retrieval systems
- information retrieval systems
- word segmentation
- web information retrieval
- n gram
- optical character recognition
- text documents