Word Segmentation for Text in Japanese Ancient Writings Based on Probability of Character N-Grams.
Mamoru Yoshimura, Fuminori Kimura, Akira Maeda
Published in: ICADL (2012)
Keyphrases
- word segmentation
- character n-grams
- n-gram
- Chinese text
- variable length
- language independent
- language model
- word recognition
- language modeling
- text classification
- document analysis
- bag of words
- word level
- handwritten documents
- out of vocabulary
- cross language
- part of speech
- information retrieval
- language specific
- text documents
- text analysis
- cross lingual
- text retrieval
- question answering
- web documents
- vision system
- active learning