Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval.
Minkuk Kim, Hyeon Bae Kim, Jinyoung Moon, Jinwoo Choi, Seong Tae Kim
Published in: CoRR (2024)
Keyphrases
- cross modal
- multi modal
- visual data
- multimedia retrieval
- image retrieval
- multimedia
- video data
- multimedia databases
- semantic concepts
- visual similarity
- video sequences
- visual recognition
- multimedia data
- multimedia information retrieval
- video content
- computer vision
- image database
- video analysis
- visual concepts
- perceptual information
- text retrieval
- video streams
- video frames
- space time
- visual features
- image content
- multimedia documents
- image classification
- information retrieval systems
- feature vectors
- object recognition
- search engine