LearnLexTo: a machine-learning based word segmentation for indexing Thai texts.
Choochart Haruechaiyasak, Sarawoot Kongyoung, Chaianun Damrongrat
Published in: CIKM-iNEWS (2008)
Keyphrases
- word segmentation
- Chinese text retrieval
- machine learning
- text classification
- document analysis
- n-gram
- handwriting recognition
- word recognition
- Chinese text
- Chinese word segmentation
- POS tagging
- language independent
- natural language
- text mining
- active learning
- data mining
- word level
- information retrieval
- computer vision
- feature selection
- text documents
- pattern recognition
- machine learning algorithms
- reinforcement learning
- knowledge representation
- cross lingual
- language modeling
- natural language processing
- unknown words
- artificial intelligence
- information extraction
- feature space
- data analysis
- transfer learning
- question answering