When the elevator pitch meets the subject heading: How mixtures of other documents can describe what a document is about.
Peter Organisciak, Michael B. Twidale
Published in: ASIST (2014)
Keyphrases
- document collections
- relevant documents
- document classification
- web documents
- text documents
- document clustering
- digital documents
- information retrieval
- electronic documents
- semi structured documents
- document content
- information retrieval systems
- document retrieval
- document processing
- document representation
- retrieval systems
- structured documents
- vector space model
- document type
- document analysis
- document ranking
- document similarity
- document structure
- document repository
- retrieved documents
- keywords
- textual content
- unstructured documents
- document summarization
- training documents
- document set
- similar documents
- document centric
- index terms
- scientific documents
- document level
- xml format
- user queries
- text collections
- multimedia documents
- xml documents
- digital libraries
- text categorization
- printed documents
- query biased
- topic hierarchy
- document relevance
- scanned documents
- related documents
- ranked list
- query terms
- text classifiers
- retrieval strategies
- document images
- maximal marginal relevance
- document archives
- control system
- tf idf
- latent semantic analysis
- textual documents
- logical structure
- text mining
- document corpus
- semantic information
- query expansion
- pdf documents
- cross references
- text classification
- pdf files
- relevance feedback
- relevance model
- term frequency