DEPLAIN: A German Parallel Corpus with Intralingual Translations into Plain Language for Sentence and Document Simplification.
Regina Stodden, Omar Momen, Laura Kallmeyer
Published in: CoRR (2023)
Keyphrases
- parallel corpus
- machine translation
- source language
- query translation
- machine translation system
- parallel texts
- word alignment
- target language
- cross lingual
- cross language
- cross language information retrieval
- sentence pairs
- language independent
- document clustering
- document classification
- bilingual dictionaries
- document collections
- document retrieval
- word level
- statistical machine translation
- latent semantic analysis
- natural language
- parallel corpora
- information retrieval systems
- natural language processing
- information retrieval
- information extraction
- query terms
- text documents
- semantic space
- relevant documents
- query expansion
- web documents
- user queries
- text summarization
- text classification
- document representation
- vector space model
- semantic information
- keywords
- feature selection