Keyphrases
- document collections
- relevant documents
- document classification
- document clustering
- text documents
- web documents
- information retrieval systems
- digital documents
- document content
- information retrieval
- document representation
- document retrieval
- semi-structured documents
- electronic documents
- retrieval systems
- document processing
- document analysis
- document similarity
- retrieved documents
- structured documents
- document structure
- multimedia documents
- document type
- document set
- document archives
- document ranking
- training documents
- document summarization
- unstructured documents
- vector space model
- similar documents
- textual content
- query terms
- document-centric
- document images
- document repository
- keywords
- digital libraries
- document level
- index terms
- related documents
- latent semantic analysis
- textual documents
- semantic information
- text classification
- scanned documents
- xml format
- scientific documents
- xml documents
- topic hierarchy
- latent topics
- pdf files
- user queries
- test collection
- printed documents
- text categorization
- text classifiers
- keyword extraction
- document relevance
- inverted index
- query expansion
- text mining
- maximal marginal relevance
- inter-document similarities
- retrieval strategies
- pdf documents
- automatic text classification
- ranked list
- information extraction
- extensible markup language
- logical structure
- relevance ranking
- free text
- metadata