AVE-CLIP: AudioCLIP-based Multi-window Temporal Transformer for Audio Visual Event Localization.

Tanvir Mahmud, Diana Marculescu
Published in: CoRR (2022)