Vector Space Model and Overlap Metric for Author Identification: Notebook for PAN at CLEF 2013.
Arun Jayapal, Binayak Goswami
Published in: CLEF (Working Notes) (2013)
Keyphrases
- vector space model
- author identification
- information retrieval
- retrieval model
- query expansion
- test collection
- language model
- question answering
- semantic similarity
- web documents
- vector space
- highly skewed
- information retrieval systems
- cross language
- cross lingual
- document clustering
- language modeling
- search engine
- distance measure
- document retrieval
- text mining
- domain knowledge
- semantic information
- text retrieval
- document collections
- text categorization
- text classification
- language independent
- knowledge discovery