Keyphrases
- web documents
- document collections
- information retrieval
- extensible markup language
- database
- document images
- information retrieval systems
- retrieval systems
- structured documents
- document retrieval
- document clustering
- feature selection
- machine learning
- data mining
- document classification
- information integration
- markup language
- document processing
- keyword extraction
- case study
- website
- text documents
- vector space model
- databases