Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers.

Published in: ICML (2020)

Keyphrases