Mitigating Vocabulary Mismatch on Multi-domain Corpus using Word Embeddings and Thesaurus.
Nagesh Yadav, Alessandro Di Bari, Miao Wei, John Segrave-Daly, Conor Cullen, Denisa Moga, Jillian Scalvini, Ciaran Hennessy, Morten Kristiansen, Omar O'Sullivan
Published in: ICAART (1) (2020)
Keyphrases
- multi-domain
- vocabulary mismatch
- keywords
- query expansion
- sentence-level
- text segmentation
- domain-specific
- cross-domain
- training corpus
- sentence retrieval
- information retrieval
- language model
- sentiment analysis
- information retrieval systems
- sentiment classification
- n-gram
- text documents
- co-occurrence
- digital libraries
- pseudo relevance feedback
- novelty detection
- general-purpose
- relevance feedback
- text retrieval
- heterogeneous networks
- language modeling
- retrieval effectiveness
- vector space
- search engine
- query translation
- multi-document summarization
- word segmentation
- k-nearest neighbor
- kNN