Document classification through image-based character embedding and wildcard training.
Daiki Shimada, Ryunosuke Kotani, Hitoshi Iyatomi
Published in: IEEE BigData (2016)
Keyphrases
- document classification
- text classification
- classification algorithm
- topic extraction
- web documents
- text categorization
- text mining
- linear classification
- text documents
- web document classification
- training set
- automatic document classification
- supervised learning
- training examples
- information extraction
- object recognition
- information retrieval
- real world
- neural network
- image classification
- small number
- training samples
- image features
- naive bayes
- active learning
- multiscale
- tree patterns
- search engine