Word Segmentation on Micro-blog Texts with External Lexicon and Heterogeneous Data.
Qingrong Xia, Zhenghua Li, Jiayuan Chao, Min Zhang
Published in: CoRR (2016)
Keyphrases
- heterogeneous data
- word segmentation
- micro blog
- data integration
- data sources
- n gram
- natural language
- complex data
- word recognition
- databases
- data management
- metadata
- micro blogging
- language independent
- text classification
- sentiment analysis
- online social networks
- social networks
- language modeling
- information sources
- text documents
- document analysis
- user generated content
- pattern recognition
- web data
- keywords
- cross lingual
- high dimensional data
- image processing
- user profiles
- information extraction