MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers.
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
Published in: NeurIPS (2020)
Keyphrases
pre-trained
training data
training examples
statistical model
focus of attention