A Corpus for Analyzing Text Reuse by People of Different Groups.
Waqas Arshad Cheema, Fahad Najib, Shakil Ahmed, Syed Husnain Bukhari, Abdul Sittar, Rao Muhammad Adeel Nawab
Published in: CLEF (Working Notes) (2015)
Keyphrases
- broad coverage
- open domain
- newspaper articles
- supervised machine learning
- text data
- sentence level
- natural language text
- text corpora
- multiword
- text corpus
- world knowledge
- plain text
- free text
- lexical features
- text mining
- text documents
- English words
- word pairs
- temporal expressions
- document corpus
- training corpus
- text retrieval
- anaphora resolution
- text collections
- scientific papers
- recognizing textual entailment
- spontaneous speech
- information extraction systems
- linguistic patterns
- textual features
- machine translation system
- document level
- software reuse
- computational linguistics
- noun phrases
- information retrieval
- word frequency
- syntactic features
- topic segmentation
- sentiment analysis
- semantic information
- database