Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation.
Jiawei Huang, Yi Ren, Rongjie Huang, Dongchao Yang, Zhenhui Ye, Chen Zhang, Jinglin Liu, Xiang Yin, Zejun Ma, Zhou Zhao
Published in: CoRR (2023)
Keyphrases
- text graphics
- multimedia
- visual data
- audio stream
- audio content
- human language
- audio video
- audio signals
- cross modal
- signal processing
- multi modal
- visual information
- video search
- spatio temporal
- broadcast news
- digital video
- audio visual
- temporal constraints
- spatial and temporal
- music information retrieval
- audio features
- emotion recognition
- temporal expressions
- text generation
- free text
- temporal reasoning
- music score