Text-to-Audio Generation Synchronized with Videos.
Shentong Mo, Jing Shi, Yapeng Tian
Published in: CoRR (2024)
Keyphrases
- text generation
- text graphics
- lecture videos
- natural language descriptions
- text to speech
- keywords
- video content analysis
- natural language generation
- video segments
- text retrieval
- video search
- multimedia
- cross media retrieval
- human language
- news video
- information retrieval
- video clips
- video indexing and retrieval
- video data
- audio visual
- media streams
- audio content
- text data
- video analysis
- video collections
- audio features
- video frames
- video content
- visual data
- visual information
- text documents
- video recordings
- video streams
- multi modal
- signal processing
- text mining
- human activities
- video sequences
- video surveillance