Machine learning based framework for fine-grained word segmentation and enhanced text normalization for low resourced language.
Shahzad Nazir, Muhammad Asif, Mariam Rehman, Shahbaz Ahmad
Published in: PeerJ Comput. Sci. (2024)
Keyphrases
- fine grained
- machine learning
- coarse grained
- word segmentation
- text mining
- access control
- natural language
- chinese text
- document analysis
- text analysis
- chinese text retrieval
- language independent
- n gram
- wordnet
- text classification
- language model
- web search
- information extraction
- high dimensional
- feature selection