SVT: Supertoken Video Transformer for Efficient Video Understanding.

Chenbin Pan, Rui Hou, Hanchao Yu, Qifan Wang, Senem Velipasalar, Madian Khabsa

Published in: CoRR (2023)

Keyphrases

video sequences
video data
multimedia
video content
real time
video streams
space time
video analysis
digital video
video frames
spatial and temporal
video clips
event recognition
artificial intelligence
video database
surveillance videos
online video
visual data
video retrieval
computationally efficient
low level
expert systems
computer vision
neural network