CroCoSum: A Benchmark Dataset for Cross-Lingual Code-Switched Summarization.
Ruochen Zhang, Carsten Eickhoff
Published in: CoRR (2023)
Keyphrases
- benchmark datasets
- cross lingual
- machine translation
- language modeling
- cross lingual information retrieval
- cross language
- language independent
- event extraction
- text classification
- parallel corpus
- transfer learning
- multi document summarization
- news articles
- machine learning
- parallel corpora
- translation model
- document clustering
- text mining
- text summarization
- query translation
- natural language
- probabilistic model
- topic models