End-to-end Generative Pretraining for Multimodal Video Captioning.
Paul Hongsuck Seo, Arsha Nagrani, Anurag Arnab, Cordelia Schmid
Published in: CVPR (2022)
Keyphrases
- end to end
- scalable video
- video streams
- multimedia
- admission control
- video content
- video sequences
- wireless ad hoc networks
- video data
- congestion control
- real time
- multipath
- ad hoc networks
- video frames
- internet protocol
- transport protocol
- rate allocation
- application layer
- compressed video
- digital video
- multimedia data
- video on demand
- image quality
- real world