An improved MDL-based compression algorithm for unsupervised word segmentation.
Ruey-Cheng Chen
Published in: ACL (2) (2013)
Keyphrases
- compression algorithm
- word segmentation
- POS tagging
- Chinese text retrieval
- data compression
- image compression
- compression ratio
- bitstream
- n-gram
- word recognition
- language independent
- text classification
- Chinese word segmentation
- quadtree decomposition
- Chinese text
- semi-supervised
- document analysis
- cross-lingual
- word level
- language modeling
- machine learning
- wavelet transform
- supervised learning
- computational complexity
- coding scheme
- computer vision