Audio-Visual Person Verification based on Recursive Fusion of Joint Cross-Attention.
R. Gnana Praveen, Jahangir Alam
Published in: CoRR (2024)
Keyphrases
- audio visual
- multimodal fusion
- person authentication
- multi modal
- visual information
- multimodal biometrics
- visual data
- multimedia
- temporal context
- emotion recognition
- audio visual speech recognition
- multi stream
- video summarization
- information fusion
- low level
- video retrieval
- image representation
- computer vision
- search engine