Multi-modal Transformer for Video Retrieval.
Valentin Gabeur, Chen Sun, Karteek Alahari, Cordelia Schmid
Published in: ECCV (4) (2020)
Keyphrases
- multi modal
- video retrieval
- video search
- content based retrieval
- video database
- visual content
- video content
- semantic gap
- multi modality
- audio visual
- retrieval systems
- key frames
- image annotation
- video data
- semantic concepts
- concept detection
- high dimensional
- video clips
- humanoid robot
- video shots
- multimedia
- semantic video retrieval
- video streams
- automatic image annotation
- low level