Video and audio are images: A cross-modal mixer for original data on video-audio retrieval.
Zichen Yuan, Qi Shen, Bingyi Zheng, Yuting Liu, Linying Jiang, Guibing Guo
Published in: Knowl. Based Syst. (2024)
Keyphrases
- cross modal
- visual data
- original data
- image retrieval
- multimedia
- visual similarity
- multiple modalities
- high dimensional data
- multi modal
- video data
- image database
- visual information
- perceptual information
- multimedia retrieval
- visual features
- semantic concepts
- video sequences
- multimedia databases
- video streams
- image data
- video analysis
- raw data
- input data
- video content
- image features
- multimedia data
- visual content
- video frames
- input image
- test images
- information retrieval
- image sequences
- visual concepts
- visual recognition
- web images
- high dimensional
- low level
- content based retrieval
- data sets
- data analysis
- image regions
- image collections
- object recognition
- pattern recognition
- similarity measure
- key frames