Tibetan Unknown Word Identification from News Corpora for Supporting Lexicon-based Tibetan Word Segmentation.
Minghua Nuo, Huidan Liu, Congjun Long, Jian Wu
Published in: ACL (2) (2015)
Keyphrases
- word segmentation
- unknown words
- n-gram
- word recognition
- language independent
- POS tagging
- text classification
- cross-lingual
- document analysis
- morphological analysis
- natural language processing
- language modeling
- sparse data
- artificial intelligence
- data mining
- news articles
- text categorization
- handwritten documents
- natural language