Unified Video-Language Pre-training with Synchronized Audio.
Shentong Mo, Haofan Wang, Huaxia Li, Xu Tang
Published in: CoRR (2024)
Keyphrases
- multimedia
- media streams
- audio video
- video content analysis
- scene change detection
- video data
- multimedia processing
- digital video
- video files
- video analysis
- visual data
- multimedia information
- training set
- video streams
- audio stream
- programming language
- audio files
- video sequences
- audio signals
- language learning
- visual information
- video copy detection
- human language
- signal processing
- digital audio
- real time
- video database
- video content
- video frames
- video material
- closed captions
- sports video
- lecture videos
- audio signal
- broadcast news
- training process
- video surveillance
- natural language
- online video
- content based video retrieval
- video recordings
- soccer video
- video search
- audio content
- audio visual
- metadata