PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition.
Yanbin Hao, Diansong Zhou, Zhicai Wang, Chong-Wah Ngo, Meng Wang
Published in: CoRR (2024)
Keyphrases
- spatial and temporal
- video frames
- space time
- spatio temporal
- spatial temporal
- temporal domain
- video sequences
- relative position
- video data
- dynamic textures
- object recognition
- temporal segmentation
- video content
- spatial and temporal information
- spatial context
- video analysis
- human activities
- geometric properties
- temporal correlation
- temporal dimension
- video retrieval
- low level