Audio-Visual Grounding Referring Expression for Robotic Manipulation.
Yefei Wang
Kaili Wang
Yi Wang
Di Guo
Huaping Liu
Fuchun Sun
Published in:
ICRA (2022)
Keyphrases
audio visual
manipulation tasks
multi modal
visual information
visual data
multimedia
multi stream
temporal context
emotion recognition
person authentication
video summarization
audio visual speech recognition
multimodal fusion
human activities
spatio temporal
object recognition
computer vision