Fused Acoustic and Text Encoding for Multimodal Bilingual Pretraining and Speech Translation.

Published in: ICML (2021)
