Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification.

Wentao Zhu
Published in: CoRR (2024)
Keyphrases
  • video classification
  • multiscale
  • multimedia
  • audio-visual
  • information retrieval
  • multimodal
  • image segmentation
  • feature vectors
  • video indexing