Visualizing Document Authorship Using n-grams and Latent Semantic Indexing.
Ian Soboroff, Charles K. Nicholas, James M. Kukla, David S. Ebert
Published in: Workshop on New Paradigms in Information Visualization and Manipulation (1997)
Keyphrases
- n-gram
- latent semantic indexing
- document representation
- bag of words
- language model
- vector space model
- information retrieval
- web documents
- text retrieval
- document space
- language modeling
- text classification
- document collections
- document retrieval
- text documents
- latent semantic space
- singular value decomposition
- term frequency
- vector space
- retrieval model
- document images
- document clustering
- probabilistic model
- test collection
- part of speech
- query terms
- query expansion
- image classification
- least squares
- keywords
- data fusion
- machine learning