Hierarchically Classifying Documents Using Very Few Words.
Daphne Koller, Mehran Sahami
Published in: ICML (1997)
Keyphrases
- text documents
- pre-classified
- word spotting
- automatic text classification
- document representation
- keywords
- word frequencies
- related words
- multiword
- index terms
- text corpus
- document content
- person names
- topic hierarchy
- word frequency
- training documents
- text corpora
- linguistic information
- xml documents
- document collections
- text classification
- textual features
- information retrieval
- web documents
- information retrieval systems
- natural language text
- topic models
- keyword extraction
- document retrieval
- semantic relationships
- latent topics
- word pairs
- n-gram
- document space
- information extraction
- stop words
- text mining
- document clustering
- semantically related
- arabic documents
- word similarity
- vector space model
- retrieval systems
- printed documents
- hierarchical structure
- query terms
- part of speech
- bag of words
- word co-occurrence
- document analysis
- automatic classification
- distributional clustering
- text categorization