AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation.
Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey
Published in: ECCV (37) (2022)
Keyphrases
- visual attention
- open domain
- sound source
- visual field
- focus of attention
- eye movements
- audio visual
- saliency map
- information extraction
- eye tracking
- vision system
- visual information
- eye movement data
- multimedia
- visual search
- visual attention model
- question answering
- higher level
- attention mechanism
- visual data
- stereo camera
- real time
- multi modal
- object based visual attention
- visual saliency
- multiscale