Leveraging Pretrained Image-text Models for Improving Audio-Visual Learning.

Published in: CoRR (2023)