Keyphrases
- document classification
- neural network
- document collections
- document images
- document retrieval
- retrieval systems
- information retrieval systems
- information retrieval
- web documents
- vector space model
- euclidean distance
- text documents
- vector space
- database
- positive and negative
- document clustering
- random sampling
- distance measure
- feature vectors
- keywords
- linear constraints
- sampling algorithm
- document analysis
- vector representation
- ordered labeled trees
- distance metric
- sample size
- information extraction
- relevant documents
- structured documents
- distance function
- digital documents
- probabilistic model