Evaluation of a language identification system for mono- and multilingual text documents.
Olga Artemenko, Thomas Mandl, Margaryta Shramko, Christa Womser-Hacker
Published in: SAC (2006)
Keyphrases
- text documents
- language identification
- text mining
- text classification
- information extraction
- keywords
- text categorization
- topic models
- text data
- news articles
- WordNet
- document clustering
- bag of words
- speaker identification
- named entities
- semi-supervised learning
- document images
- nearest neighbor
- Bayesian networks
- Indian languages
- machine learning
- cross-lingual
- question answering
- knowledge representation
- multiscale