Hypergeometric Language Model and Zipf-Like Scoring Function for Web Document Similarity Retrieval.
Felipe Bravo-Marquez, Gaston L'Huillier, Sebastián A. Ríos, Juan D. Velásquez
Published in: SPIRE (2010)
Keyphrases
- similarity retrieval
- scoring function
- web documents
- language model
- language modeling
- document retrieval
- ranking functions
- similarity search
- n-gram
- information extraction
- web search engines
- information retrieval
- vector space model
- probabilistic model
- structure learning
- mixture model
- retrieval model
- web pages
- query expansion
- similarity queries
- test collection
- document length
- image retrieval
- smoothing methods
- query terms
- power law
- keywords
- web logs
- language modeling framework