AVSegFormer: Audio-Visual Segmentation with Transformer.
Shengyi Gao
Zhe Chen
Guo Chen
Wenhai Wang
Tong Lu
Published in:
AAAI (2024)
Keyphrases
audio visual
multi modal
temporal segmentation
visual information
visual data
emotion recognition
temporal context
multimedia
person authentication
image segmentation
multi stream
video summarization
multiscale
audio visual speech recognition
visual content
image data
pattern recognition
multimodal fusion