CroCoSum: A Benchmark Dataset for Cross-Lingual Code-Switched Summarization.
Ruochen Zhang, Carsten Eickhoff
Published in: LREC/COLING (2024)
Keyphrases
- benchmark datasets
- cross lingual
- machine translation
- cross lingual information retrieval
- language modeling
- event extraction
- cross language
- language independent
- text classification
- translation model
- document clustering
- parallel corpus
- parallel corpora
- machine learning
- machine translation system
- word sense
- news articles
- transfer learning
- query translation
- natural language processing
- n gram