Audio-Visual Saliency Network with Audio Attention Module.
Shuaiyang Cheng, Xing Gao, Liang Song, Jianbing Xiahou
Published in: ICAIIS (2021)
Keyphrases
- audio visual
- multi modal
- visual information
- visual data
- temporal context
- audio visual speech recognition
- multi stream
- audio features
- speaker verification
- audio visual content
- emotion recognition
- multimedia
- multimodal fusion
- person authentication
- image content
- dimensionality reduction
- spatio temporal
- high dimensional
- three dimensional
- data sets