Construction of Scholarly n-Gram from Huge Text Data.
Myunggwon Hwang, Ha Neul Yeom, Mi-Nyeong Hwang, Hanmin Jung
Published in: IMIS (2014)
Keyphrases
- n gram
- text data
- text classification
- text mining
- bag of words
- text documents
- text categorization
- feature selection
- language model
- high dimensional
- machine learning
- digital libraries
- variable length
- structured data
- language modeling
- naive bayes
- word segmentation
- text classifiers
- knn
- labeled data
- document collections
- high dimensional data
- statistical language modeling
- dimensionality reduction
- k nearest neighbor
- metadata
- document representation
- relational databases
- unsupervised learning
- natural language processing