Clustering Documents with Active Learning Using Wikipedia.
Anna-Lan Huang, David N. Milne, Eibe Frank, Ian H. Witten
Published in: ICDM (2008)
Keyphrases
- active learning
- document clustering
- text clustering
- clustering algorithm
- clustering method
- document collections
- k-means
- information retrieval systems
- cosine similarity
- document retrieval
- document classification
- metadata
- hierarchical clustering
- relevance feedback
- xml documents
- data objects
- information retrieval
- relevant documents
- text documents
- document representation
- cluster analysis
- vector space
- random sampling
- information theoretic
- retrieval systems
- experimental design
- web documents
- supervised learning
- keywords
- learning algorithm
- selective sampling
- learning strategies
- data clustering
- data sets
- unsupervised learning
- text mining
- machine learning