MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers.

Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
Published in: ACL/IJCNLP (Findings) (2021)
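The paper's core idea is distilling self-attention relations: for each of a fixed number of relation heads, teacher and student produce a softmax-normalized token-by-token relation map, and the student is trained to match the teacher's maps via KL divergence. The sketch below is a simplified illustration under stated assumptions, not the paper's implementation: it computes relations directly from hidden states (MiniLMv2 computes separate query, key, and value relations from the projected Q/K/V vectors), and the function names and the choice of 4 relation heads are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_relation(x, num_rel_heads):
    """Split hidden states (T, H) into relation heads and return one
    softmax-normalized scaled dot-product relation map (T, T) per head.
    Simplification: MiniLMv2 builds such maps from Q, K, and V separately."""
    T, H = x.shape
    d = H // num_rel_heads
    heads = x.reshape(T, num_rel_heads, d).transpose(1, 0, 2)  # (R, T, d)
    scores = heads @ heads.transpose(0, 2, 1) / np.sqrt(d)
    return softmax(scores, axis=-1)

def relation_distill_loss(teacher_x, student_x, num_rel_heads=4):
    """KL(teacher || student) between relation distributions, averaged over
    heads and positions. Relation maps are (T, T), so teacher and student
    hidden sizes need not match."""
    t = self_attention_relation(teacher_x, num_rel_heads)
    s = self_attention_relation(student_x, num_rel_heads)
    kl = (t * (np.log(t) - np.log(s))).sum(axis=-1)
    return kl.mean()
```

Because the relation maps have shape (sequence length, sequence length) regardless of hidden size, the loss is well defined even when the student is much narrower than the teacher, which is what makes this objective convenient for compression.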