N-gram and Gazetteer List Based Named Entity Recognition for Urdu: A Scarce Resourced Language.
Faryal Jahangir, Waqas Anwar, Usama Ijaz Bajwa, Xuan Wang
Published in: ALR@COLING (2012)
Keyphrases
- named entity recognition
- n-gram
- named entities
- language specific
- information extraction
- natural language processing
- character n-grams
- out-of-vocabulary
- language model
- maximum entropy
- semi-supervised
- text summarization
- conditional random fields
- text classification
- natural language
- language independent
- variable length
- annotated corpus
- relation extraction
- language modeling
- part-of-speech
- web documents
- word segmentation
- text mining
- sentiment analysis
- linguistic features
- machine learning
- question answering
- word sense disambiguation
- text categorization
- image segmentation
- decision trees