Enhancing Video Transformers for Action Understanding with VLM-aided Training.
Hui Lu, Hu Jian, Ronald Poppe, Albert Ali Salah
Published in: CoRR (2024)
Keyphrases
- human actions
- video data
- video streams
- video content
- real time
- video sequences
- training set
- video database
- video clips
- video retrieval
- key frames
- online video
- video search
- video processing
- interactive video
- video analysis
- pre-trained
- neural network
- temporal information
- spatial and temporal
- video frames
- action recognition
- training examples
- supervised learning
- multimedia
- computer vision
- learning algorithm
- training process
- video segmentation
- online learning
- real-time video
- recognizing human actions