HaViT: Hybrid-Attention Based Vision Transformer for Video Classification.

Li Li, Liansheng Zhuang, Shenghua Gao, Shafei Wang
Published in: ACCV (4) (2022)
Keyphrases
  • video classification
  • computer vision
  • video clips
  • video content
  • category labels
  • video shots
  • visual features
  • metadata
  • multimedia
  • spatio-temporal
  • humanoid robot