Multimodal Pretraining for Dense Video Captioning.
Gabriel Huang, Bo Pang, Zhenhai Zhu, Clara Rivera, Radu Soricut
Published in: CoRR (2020)
Keyphrases
- multimedia
- video data
- video content
- video streams
- video frames
- multi modal
- video sequences
- space time
- real time
- video retrieval
- story segmentation
- spatial and temporal
- digital video
- multimodal information
- online video
- real time video
- video processing
- video database
- spatio temporal
- video segmentation
- video shots
- video analysis
- video images
- dynamic scenes
- video clips
- multi view