Adaptive Ensemble Self-Distillation With Consistent Gradients for Fast Inference of Pretrained Language Models.

Published in: IEEE/ACM Trans. Audio Speech Lang. Process. (2024)
