HaViT: Hybrid-Attention Based Vision Transformer for Video Classification.

Li Li, Liansheng Zhuang, Shenghua Gao, Shafei Wang

Published in: ACCV (4) (2022)

Keyphrases

video classification
computer vision
video clips
video content
category labels
video shots
visual features
metadata
multimedia
spatio-temporal
humanoid robot