Look Before you Speak: Visually Contextualized Utterances.
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
Published in:
CoRR (2020)
Keyphrases
natural language
search algorithm
real time
three dimensional
pairwise
feature vectors
visual representation