MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers.
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
Published in: CoRR (2020)
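The paper's core technique, deep self-attention distillation, trains the student to mimic the teacher's last Transformer layer: the KL divergence between teacher and student self-attention distributions, plus the KL divergence between their value-relation distributions (scaled dot-products among the values). A minimal NumPy sketch of that objective is below; the array shapes and function names are illustrative, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_distribution(q, k):
    # Scaled dot-product attention probabilities, shape (heads, seq, seq).
    d_k = q.shape[-1]
    return softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_k))

def value_relation(v):
    # Value-relation distribution: softmax over scaled V V^T, per head.
    d_v = v.shape[-1]
    return softmax(v @ v.transpose(0, 2, 1) / np.sqrt(d_v))

def kl_div(p, q, eps=1e-12):
    # KL(p || q), averaged over heads and query positions.
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1))

def minilm_loss(teacher_qkv, student_qkv):
    # Each argument is a (queries, keys, values) tuple from the model's
    # last self-attention layer, shaped (heads, seq, head_dim).
    tq, tk, tv = teacher_qkv
    sq, sk, sv = student_qkv
    l_at = kl_div(attention_distribution(tq, tk),
                  attention_distribution(sq, sk))
    l_vr = kl_div(value_relation(tv), value_relation(sv))
    return l_at + l_vr
```

Because both transfer terms compare seq-by-seq relation matrices rather than hidden states, the student's hidden size may differ from the teacher's; only the number of attention (relation) heads must match.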