Vietnamese Word Segmentation with CRFs and SVMs: An Investigation.
Cam-Tu Nguyen, Trung-Kien Nguyen, Xuan Hieu Phan, Le-Minh Nguyen, Quang-Thuy Ha
Published in: PACLIC (2006)
Keyphrases
- word segmentation
- named entity recognition
- support vector
- conditional random fields
- named entities
- information extraction
- word recognition
- n gram
- multi class
- language independent
- feature selection
- maximum entropy
- chinese word segmentation
- pos tagging
- natural language processing
- chinese text
- text classification
- hidden markov models
- knn
- maximum margin
- graphical models
- semi supervised
- structured prediction
- machine learning
- training set
- text mining
- cross lingual
- training data
- pairwise
- language modeling
- document analysis
- feature vectors
- feature space
- pattern recognition
- machine vision
- image segmentation
- chinese text retrieval