Seeing With Sound: Long-Range Acoustic Beamforming for Multimodal Scene Understanding.
Praneeth Chakravarthula, Jim Aldon D'Souza, Ethan Tseng, Joe Bartusek, Felix Heide
Published in: CVPR (2023)
Keyphrases
- long range
- scene understanding
- sound source
- audio visual
- short range
- object recognition
- vision system
- object detection
- 3D scene
- video surveillance
- scene categorization
- conditional random fields
- speech signal
- long range interactions
- computer vision
- long range correlations
- focus of attention
- machine learning
- object class
- information extraction
- information retrieval