COUNTER: corpus of Urdu news text reuse.
Muhammad Sharjeel, Rao Muhammad Adeel Nawab, Paul Rayson
Published in: Lang. Resour. Evaluation (2017)
Keyphrases
- topic tracking
- broad coverage
- keywords
- sentence level
- news stories
- news articles
- supervised machine learning
- open domain
- sentiment analysis
- newspaper articles
- scientific papers
- text data
- natural language text
- text processing
- news corpus
- plain text
- news video
- text corpora
- person names
- text collections
- short texts
- free text
- topic segmentation
- text corpus
- information retrieval
- language identification
- multi lingual
- keyword extraction
- english words
- anaphora resolution
- financial news
- named entity disambiguation
- cross media
- training corpus
- lexical features
- noun phrases
- social media
- document corpus
- text summarization
- multiword
- natural language processing
- text documents
- text retrieval
- textual content
- information extraction
- topic detection and tracking
- automatic summarization
- recognizing textual entailment
- word sense
- news sources
- user comments
- linguistic patterns
- language model
- linguistic information
- topic models
- spontaneous speech
- video search
- learning objects
- writing style
- lexical chains
- document level
- world knowledge
- online news
- short text