Sound Source Localization is All about Cross-Modal Alignment.
Arda Senocak, Hyeonggon Ryu, Junsik Kim, Tae-Hyun Oh, Hanspeter Pfister, Joon Son Chung
Published in: CoRR (2023)
Keyphrases
- cross modal
- sound source
- source localization
- localization algorithm
- multi modal
- audio visual
- multimedia retrieval
- image retrieval
- speech signal
- visual data
- multimedia databases
- visual recognition
- visual similarity
- real environment
- focus of attention
- multimedia data
- image classification
- image database
- speech recognition
- machine learning
- object recognition