End-to-End Spatio-Temporal Action Localisation with Video Transformers.
Alexey A. Gritsenko, Xuehan Xiong, Josip Djolonga, Mostafa Dehghani, Chen Sun, Mario Lucic, Cordelia Schmid, Anurag Arnab
Published in: CoRR (2023)
Keyphrases
- end to end
- human actions
- spatio temporal
- spatial and temporal
- scalable video
- action recognition
- space time
- video sequences
- spatial temporal
- wireless ad hoc networks
- human motion
- congestion control
- human activities
- multipath
- high bandwidth
- video data
- ad hoc networks
- real time
- admission control
- video streams
- video frames
- video content
- internet protocol
- content delivery
- visual features
- multimedia
- transport layer
- application layer
- low complexity
- key frames
- moving objects