On Attention Modules for Audio-Visual Synchronization.
Naji Khosravan, Shervin Ardeshir, Rohit Puri
Published in: CoRR (2018)
Keyphrases
- audio visual
- multi stream
- multi modal
- audio visual speech recognition
- visual information
- visual data
- temporal context
- emotion recognition
- person authentication
- video summarization
- multimedia
- hidden Markov models
- multimodal fusion
- three dimensional
- text classification
- co occurrence
- natural language processing
- spatio temporal