Unified Model for Image, Video, Audio and Language Tasks.
Mustafa Shukor, Corentin Dancette, Alexandre Ramé, Matthieu Cord
Published in: CoRR (2023)
Keyphrases
- unified model
- visual data
- multimedia
- video files
- image data
- image retrieval
- image features
- input image
- image classification
- audio video
- image content
- multiscale
- image collections
- single image
- image analysis
- natural language
- image frames
- low level
- high resolution
- video sequences
- video images
- video analysis
- key frames
- video data
- visual cues
- static images
- weakly labeled
- scene change detection
- visual information
- image segmentation
- edge detection
- signal processing
- feature points
- image representation
- image regions
- segmentation algorithm
- segmentation method
- visual features
- multimedia information
- spatial information
- audio signals
- video content
- programming language
- image sequences
- test images