M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval.
Layne Berry
Yi-Jen Shih
Hsuan-Fu Wang
Heng-Jui Chang
Hung-Yi Lee
David Harwath
Published in:
ICASSP (2023)
Keyphrases
image retrieval
pre-trained
probabilistic model
image database
speech recognition
statistical model
neural network
computer vision