Mining Atomic Chinese Abbreviation Pairs: A Probabilistic Model for Single Character Word Recovery.
Jing-Shin Chang, Wei-Lun Teng
Published in: SIGHAN@COLING/ACL (2006)
Keyphrases
- probabilistic model
- word segmentation
- language model
- linguistic knowledge
- co-occurrence
- Chinese text
- pairwise
- data mining
- text mining
- web mining
- English text
- Chinese characters
- word pairs
- cursive handwriting
- Chinese word segmentation
- writing style
- mining algorithm
- pattern mining
- sequential patterns
- n-gram
- knowledge discovery