Fusion of audio and video modalities for detection of acoustic events.
Taras Butko, Andrey Temko, Climent Nadeu, Cristian Canton-Ferrer
Published in: INTERSPEECH (2008)
Keyphrases
- soccer video
- multimodal fusion
- event detection
- visual data
- video analysis
- multi modal fusion
- video data
- multimedia
- video streams
- event recognition
- decision level fusion
- video clips
- audio video
- video content analysis
- single modality
- multi modal
- video scene
- audio visual
- sports video
- video content
- video event
- audio signal
- digital video
- face detection and tracking
- audio features
- human activities
- scene change detection
- cross modal
- multimedia processing
- video recordings
- video annotation
- surveillance videos
- shot boundary detection
- video files
- video sequences
- long video
- video frames
- mouth region
- multi modality
- key frames
- multimedia information
- acoustic features
- video surveillance
- detection algorithm
- audio files
- object detection
- video database
- temporal information
- media streams
- image fusion
- image sequences
- hidden markov models
- multiple modalities
- fusion method
- information fusion