Boosting the Feature Space: Text Classification for Unstructured Data on the Web.
Yang Song, Ding Zhou, Jian Huang, Isaac G. Councill, Hongyuan Zha, C. Lee Giles
Published in: ICDM (2006)
Keyphrases
- unstructured data
- text classification
- textual data
- feature space
- unstructured information
- feature selection
- text mining
- structured data
- text categorization
- semi structured data
- text data
- text documents
- structured and unstructured data
- big data
- website
- semi structured
- bag of words
- information extraction
- high dimensional
- web data
- machine learning
- principal component analysis
- feature vectors
- web mining
- database
- image retrieval
- textual information
- web documents
- k nearest neighbor
- relational databases
- data warehouse
- raw data
- metadata
- web pages
- information retrieval
- databases
- n gram
- topic models
- image representation
- labeled data
- low dimensional
- natural language processing
- knn
- data points