Document Sensitivity Classification for Data Leakage Prevention with Twitter-Based Document Embedding and Query Expansion.
Lap Q. Trieu, Trung-Nguyen Tran, Mai-Khiem Tran, Minh-Triet Tran
Published in: CIS (2017)
Keyphrases
- query expansion
- information retrieval systems
- relevant documents
- information retrieval
- document ranking
- document classification
- retrieved documents
- query terms
- trec genomics track
- model for information retrieval
- document level
- ad hoc retrieval
- ranking scheme
- query specific
- passage retrieval
- retrieval effectiveness
- retrieval systems
- document images
- language model
- document collections
- initial query
- relevance model
- text retrieval
- relevance feedback
- decision trees
- document retrieval
- language modeling
- language modeling framework
- text documents
- document clustering
- search engine
- text classification
- web documents
- probabilistic retrieval model
- feature extraction
- pseudo relevance feedback
- retrieval model
- user queries
- feature space
- web search
- pseudo feedback
- training set
- question answering
- term frequency
- machine learning
- sentiment analysis