The CNN-Corpus: A Large Textual Corpus for Single-Document Extractive Summarization.
Rafael Dueire Lins, Hilário Oliveira, Luciano de Souza Cabral, Jamilson Batista, Bruno Tenório Ávila, Rafael Ferreira, Rinaldo Lima, Gabriel de França Pereira e Silva, Steven J. Simske
Published in: DocEng (2019)
Keyphrases
- extractive summarization
- document corpus
- manually annotated
- textual features
- plain text
- text corpus
- keywords
- scientific papers
- document level
- metadata
- information retrieval systems
- similar documents
- text summarization
- document images
- cellular neural networks
- information retrieval
- training corpus
- word sense
- document structure
- noun phrases
- natural language
- document retrieval
- text documents