CLiPS Stylometry Investigation (CSI) corpus: A Dutch corpus for the detection of age, gender, personality, sentiment and deception in text.
Ben Verhoeven, Walter Daelemans
Published in: LREC (2014)
Keyphrases
- sentence level
- broad coverage
- open domain
- supervised machine learning
- sentiment analysis
- document level
- text corpus
- plain text
- text data
- newspaper articles
- text corpora
- linguistic features
- English words
- training corpus
- manually annotated
- positive or negative
- multiword
- lexical features
- text mining
- scientific papers
- topic segmentation
- document corpus
- natural language text
- opinion mining
- sentiment classification
- free text
- information retrieval
- linguistic information
- text collections
- text documents
- language model
- named entity disambiguation