Within-Document Term-Based Index Pruning with Statistical Hypothesis Testing.
Sree Lekha Thota, Ben Carterette
Published in: ECIR (2011)
Keyphrases
- statistical hypothesis testing
- document identifiers
- inverted lists
- index terms
- inverted index
- posting lists
- document representation
- indexing scheme
- inverted file
- sample size
- query terms
- information retrieval
- search space
- information retrieval systems
- document clustering
- query processing
- term weighting schemes
- term weights
- database
- document type
- document collections
- web documents
- cosine similarity
- index structure
- pruning method
- term frequency
- term dependence
- vector space model
- document retrieval
- term weighting
- signature file
- document images
- retrieval systems
- pruning power
- B-tree
- maximal marginal relevance