Summarization and Categorization of Text Data in High-Level Data Cleaning for Information Retrieval.
M. Saravanan, P. C. Reghu Raj, S. Raman
Published in: Appl. Artif. Intell. (2003)
Keyphrases
- text data
- data cleaning
- text classification
- text mining
- information retrieval
- text categorization
- information extraction
- document collections
- text documents
- feature selection
- bag of words
- machine learning
- data integration
- natural language processing
- high dimensional
- structured data
- knn
- data quality
- information retrieval systems
- data processing
- knowledge discovery
- record linkage
- database
- outlier detection
- semi structured
- search engine
- question answering
- query expansion
- data management
- named entities
- data warehousing
- pattern recognition
- web usage mining
- data mining