M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval.

Layne Berry, Yi-Jen Shih, Hsuan-Fu Wang, Heng-Jui Chang, Hung-Yi Lee, David Harwath
Published in: ICASSP (2023)
Keyphrases
  • image retrieval
  • pre-trained
  • probabilistic model
  • image database
  • speech recognition
  • statistical model
  • neural network
  • computer vision