Word Segmentation on Micro-Blog Texts with External Lexicon and Heterogeneous Data.
Qingrong Xia, Zhenghua Li, Jiayuan Chao, Min Zhang
Published in: NLPCC/ICCPOL (2016)
Keyphrases
- heterogeneous data
- word segmentation
- micro blog
- data integration
- data management
- natural language
- data sources
- databases
- metadata
- n gram
- language independent
- social networks
- complex data
- word recognition
- online social networks
- text classification
- sentiment analysis
- cross lingual
- micro blogging
- language modeling
- real world
- user generated content
- text documents
- document analysis
- keywords
- web data
- high dimensional data
- information sources
- data model
- query processing