MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies

Shiyue Zhang, Shijie Wu, Ozan Irsoy, Steven Lu, Mohit Bansal, Mark Dredze, David S. Rosenberg
Published in: ACL (1) (2023)