Investigating Modality Bias in Audio Visual Video Parsing.
Piyush Singh Pasi, Shubham Nemani, Preethi Jyothi, Ganesh Ramakrishnan
Published in: CoRR (2022)
Keyphrases
- audio visual
- multi modal
- video summarization
- visual data
- meeting room
- multimedia
- audio visual content
- audio features
- sports video
- temporal context
- video data
- visual information
- multi stream
- multimodal fusion
- video sequences
- video frames
- natural language processing
- video content
- person authentication
- audio visual speech recognition
- metadata
- high dimensional
- domain knowledge
- low level
- human actions
- key frames