M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval.
Layne Berry
Yi-Jen Shih
Hsuan-Fu Wang
Heng-Jui Chang
Hung-yi Lee
David Harwath
Published in:
CoRR (2022)
Keyphrases
image retrieval
pre-trained
data sets
neural network
image database
prior knowledge
image matching
speech signal