From word embeddings to document similarities for improved information retrieval in software engineering.
Xin Ye, Hui Shen, Xiao Ma, Razvan C. Bunescu, Chang Liu
Published in: ICSE (2016)
Keyphrases
- information retrieval
- software engineering
- term weighting
- information retrieval systems
- document collections
- related documents
- spoken document retrieval
- vector space model
- retrieval systems
- relevant documents
- tf-idf
- document retrieval
- term frequency
- retrieval model
- software systems
- test collection
- document space
- co-occurrence
- text corpus
- keywords
- retrieval strategies
- vector space
- structured documents
- artificial intelligence
- text retrieval
- document clustering
- document images
- text mining
- related words
- text categorization
- word co-occurrence
- search engine
- keyword extraction
- document relevance
- inverse document frequency
- maximal marginal relevance
- latent topics
- noun phrases
- semantic similarity
- object-oriented
- word segmentation
- latent semantic analysis
- document representation
- language modeling
- document frequency
- retrieval effectiveness
- query terms
- question answering
- language model
- document corpus
- probabilistic model