Self-Distillation for Further Pre-training of Transformers.

Seanie Lee, Minki Kang, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi

Published in: ICLR (2023)

Keyphrases

training process
supervised learning
computer software
online learning
image processing
search algorithm
object recognition
training set
small number
data sets
virtual environment
training samples
serious games
artificial intelligence
training phase
feed forward neural networks
genetic algorithm