Using Word Embeddings for Information Retrieval: How Collection and Term Normalization Choices Affect Performance.
Dwaipayan Roy, Debasis Ganguly, Sumit Bhatia, Srikanta Bedathur, Mandar Mitra
Published in: CIKM (2018)
Keyphrases
- term weighting
- information retrieval
- related documents
- document collections
- inverse document frequency
- term frequency
- trec collections
- tf-idf
- vector space model
- term weights
- text retrieval
- text collections
- vector space
- information retrieval systems
- text categorization
- document space
- language modeling
- query terms
- retrieval model
- effective retrieval
- document representation
- test collection
- term weighting schemes
- term dependence
- retrieval systems
- learning to rank
- multiword
- manifold learning
- distributed information retrieval
- n-gram
- low dimensional
- preprocessing
- keywords
- information extraction
- co-occurrence
- term selection
- relevant documents
- language model
- query expansion
- spoken document retrieval
- language modeling framework
- queries and relevance judgments
- word segmentation
- latent semantic indexing
- document retrieval
- question answering
- distance measure
- high dimensional
- feature space
- natural language
- search engine