Cross-Modal Transformers for Audio-Visual Person Verification.
Gnana Praveen Rajasekhar
Jahangir Alam
Published in:
Odyssey (2024)
Keyphrases
audio visual
cross modal
multi modal
visual data
visual information
multimedia retrieval
multimedia data
high dimensional
visual recognition
image retrieval
contextual information
spatial information
computer vision
image annotation
human motion
video sequences
high level
multimedia