AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation.
Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey
Published in: CoRR (2022)
Keyphrases
- visual attention
- open domain
- sound source
- visual field
- focus of attention
- eye movements
- eye tracking
- saliency map
- information extraction
- audio visual
- vision system
- visual search
- question answering
- visual information
- higher level
- multimedia
- eye movement data
- visual attention model
- stereo camera
- object based visual attention
- visual saliency
- low level
- question answering systems
- salient regions
- eye tracking data
- visual data
- image quality