Information Retrieval in long documents: Word clustering approach for improving Semantics.
Paul Mbate Mekontchou, Armel Fotsoh, Bernabe Batchakui, Eddy Ella
Published in: CoRR (2023)
Keyphrases
- information retrieval
- document collections
- document clustering
- text clustering
- information retrieval systems
- term weighting
- related documents
- relevant documents
- vector space model
- clustering algorithm
- retrieval systems
- document space
- document retrieval
- text mining
- spoken document retrieval
- semantic information
- word spotting
- k-means
- text documents
- multiword
- co-occurrence
- text collections
- concept space
- structured documents
- latent semantic indexing
- natural language text
- query terms
- text corpus
- clustering method
- Sparck Jones
- text retrieval
- test collection
- learning to rank
- n-gram
- language model
- document representation
- keywords
- sentence level
- word frequencies
- spoken documents
- distributional clustering
- language modeling
- search engine
- semantic relationships
- tf-idf
- information extraction
- question answering
- query expansion
- maximal marginal relevance
- term frequency
- word pairs
- word frequency
- user queries
- xml documents
- relevance feedback
- document corpus
- web documents
- retrieved documents
- vector space
- retrieval effectiveness
- latent topics
- document analysis
- latent semantic analysis