Distilling Transformers into Simple Neural Networks with Unlabeled Transfer Data.
Subhabrata Mukherjee, Ahmed Hassan Awadallah
Published in: CoRR (2019)
Keyphrases
- neural network
- training data
- data sets
- original data
- computer systems
- database
- raw data
- missing data
- data analysis
- prior knowledge
- complex data
- data processing
- data mining techniques
- fuzzy logic
- artificial neural networks
- pattern recognition
- data quality
- high quality
- data collection
- labeled data
- statistical analysis
- machine learning
- neural network model
- decision trees
- experimental data
- data structure
- synthetic data
- data sources
- relational databases
- input data
- small number
- xml documents
- image data
- probability distribution