Multiple Speaker Tracking in Spatial Audio via PHD Filtering and Depth-Audio Fusion.
Qingju Liu, Wenwu Wang, Teofilo de Campos, Philip J. B. Jackson, Adrian Hilton
Published in: IEEE Trans. Multim. (2018)
Keyphrases
- audio visual
- multimedia
- speaker identification
- prosodic features
- audio stream
- multimodal fusion
- signal processing
- visual information
- spatio temporal
- automatic transcription
- audio visual speech recognition
- speaker verification
- data fusion
- audio files
- spatial data
- pose estimation
- multi modal
- multimedia information
- audio features
- spatial and temporal
- audio signal
- kalman filter
- audio signals
- multi view