Identifying High Quality Document-Summary Pairs through Text Matching.
Yongshuai Hou, Yang Xiang, Buzhou Tang, Qingcai Chen, Xiaolong Wang, Fangze Zhu
Published in: Inf. (2017)
Keyphrases
- high quality
- automatic text summarization
- text summarization
- extractive summarization
- text documents
- document processing
- document analysis
- digital documents
- information retrieval
- document summarization
- keywords
- web documents
- document content
- document summaries
- string matching
- matching algorithm
- document images
- textual content
- text content
- textual documents
- text clustering
- text mining
- document corpus
- scientific documents
- text collections
- text retrieval
- document structure
- multimedia documents
- text corpus
- cross document
- retrieval engine
- automatic summarization
- keyword extraction
- latent semantic analysis
- document set
- pairwise
- syntactic analysis
- text classifiers
- document collections
- document categorization
- string similarity
- scientific papers
- information retrieval systems
- query biased
- multi document summarization
- document representation
- document level
- pattern matching
- similarity scores
- information extraction
- tf idf
- electronic documents
- handwritten text
- document clustering
- document retrieval
- word pairs
- semantic information
- relevant documents
- pdf files
- search engine
- relevance feedback
- natural language processing
- scanned documents
- related documents
- text categorization
- structured documents