Spiking Tucker Fusion Transformer for Audio-Visual Zero-Shot Learning.
Wenrui Li
Penghong Wang
Ruiqin Xiong
Xiaopeng Fan
Published in:
CoRR (2024)
Keyphrases
audio visual
multimodal fusion
person authentication
multi modal
visual information
emotion recognition
visual data
multimedia
speaker verification
multi stream
temporal context
information fusion
data sets
neural network
databases