nmT5 - Is parallel data still relevant for pre-training massively multilingual language models?

Published in: ACL/IJCNLP (2) (2021)
