Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos.
Heeseung Yun
Youngjae Yu
Wonsuk Yang
Kangil Lee
Gunhee Kim
Published in:
ICCV (2021)
Keyphrases
audio visual
question answering
passage retrieval
multi modal
visual data
visual information
natural language processing
video sequences
natural language
multimedia
information extraction
information retrieval
video frames
video search
human activities
video content
text mining
named entities
machine learning