Multi-Modal Perception Attention Network with Self-Supervised Learning for Audio-Visual Speaker Tracking.
Yidi Li
Hong Liu
Hao Tang
Published in:
CoRR (2021)
Keyphrases
audio visual
multi modal
audio visual speech recognition
multi modality
speaker verification
multi stream
cross modal
visual data
uni modal
audio features