Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability.
Ruifei He, Shuyang Sun, Jihan Yang, Song Bai, Xiaojuan Qi
Published in: CoRR (2022)
Keyphrases
- raw data
- data sets
- data mining techniques
- data quality
- data collection
- data structure
- image data
- prior knowledge
- data sources
- probability distribution
- training examples
- data processing
- knowledge discovery
- data points
- database
- data analysis
- domain knowledge
- expert systems
- input data
- knowledge acquisition
- high quality
- faster convergence
- data model
- xml documents
- evolutionary algorithm
- multiscale
- high dimensional data
- missing data
- background knowledge
- knowledge base
- learning algorithm
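
For context on the title's central technique, the sketch below shows the classic soft-label knowledge distillation objective (Hinton et al., 2015) in PyTorch. It is a generic illustration rather than the specific pre-training scheme proposed in this paper, and the names `kd_loss`, `student_logits`, `teacher_logits`, and `temperature` are illustrative.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits: torch.Tensor,
            teacher_logits: torch.Tensor,
            temperature: float = 4.0) -> torch.Tensor:
    """Soft-label knowledge distillation loss (generic sketch, not this paper's method)."""
    # Soften both output distributions with a temperature before comparing them.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # KL divergence between teacher and student; the T^2 factor keeps gradient
    # magnitudes comparable across temperatures (Hinton et al., 2015).
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
```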