Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification.
Wentao Zhu
Published in:
CoRR (2024)
Keyphrases
video classification
multiscale
multimedia
audio visual
information retrieval
multi modal
image segmentation
feature vectors
video indexing