MLSUM: The Multilingual Summarization Corpus.
Thomas Scialom, Paul-Alexis Dray, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano
Published in: EMNLP (1) (2020)
Keyphrases
- topic segmentation
- parallel corpus
- language independent
- automatic summarization
- wide coverage
- chinese english
- manually annotated
- cross language information retrieval
- digital libraries
- open domain
- cross lingual
- comparable corpora
- test set
- information extraction
- video summarization
- video search
- multi document summarization
- machine learning