Self-Distillation for Further Pre-training of Transformers.

Seanie Lee, Minki Kang, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi

Published in: CoRR (2022)

Keyphrases

training phase
artificial intelligence
training algorithm
online learning
artificial neural networks
active learning
feed forward neural networks
training process
test set
back propagation
training examples
supervised learning
probabilistic model
data sets
training set
multiscale
metadata
information systems
computer vision
genetic algorithm
information retrieval
machine learning