Collaborative Three-Stream Transformers for Video Captioning.
Hao Wang, Libo Zhang, Heng Fan, Tiejian Luo
Published in: CoRR (2023)
Keyphrases
- real time
- video streams
- video data
- video content
- multimedia
- video sequences
- data streams
- neural network
- video frames
- real time video
- collaborative learning
- spatial and temporal
- video retrieval
- video clips
- event recognition
- cooperative
- metadata
- computer vision
- human activities
- learning algorithm
- video surveillance
- video segmentation
- video processing
- audio stream