CodE Alltag 2.0 - A Pseudonymized German-Language Email Corpus.
Elisabeth Eder, Ulrike Krieg-Holz, Udo Hahn
Published in: LREC (2020)
Keyphrases
- spam filtering
- programming language
- natural language
- speech acts
- enron email
- text mining
- instant messaging
- source code
- email messages
- spanish language
- parallel corpus
- spam detection
- manually annotated
- object oriented programming
- domain specific track
- language learning
- word forms
- communication medium
- programs written
- high level programming languages
- machine learning
- electronic mail
- spam filters
- machine translation system
- communication tools
- language processing
- social networks