TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling.
Ben Verhoeven, Walter Daelemans, Barbara Plank
Published in: LREC (2016)
Keyphrases
- social media
- social networks
- parallel corpus
- online social networks
- language independent
- social networking
- wide coverage
- manually annotated
- multilingual
- open domain
- social behavior
- Chinese English
- cross language
- digital libraries
- comparable corpora
- information retrieval
- personality types
- personality traits
- topic detection
- machine translation system
- document level
- text corpora
- cross language information retrieval
- text data
- cross lingual