CLIP4VideoCap: Rethinking Clip for Video Captioning with Multiscale Temporal Fusion and Commonsense Knowledge.

Tanvir Mahmud, Feng Liang, Yaling Qing, Diana Marculescu
Published in: ICASSP (2023)