Enhancing Audio Generation Diversity with Visual Information.
Zeyu Xie, Baihan Li, Xuenan Xu, Mengyue Wu, Kai Yu
Published in: ICASSP (2024)
Keyphrases
- visual information
- visual data
- audio visual
- visual features
- low level
- visual content
- textual information
- visual cues
- human visual system
- semantic information
- eye movements
- visual information retrieval
- information extraction
- high level
- content based image retrieval systems
- semantic context
- image search
- visual descriptors
- content based image
- multimedia