AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head.
Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Zhou Zhao, Shinji Watanabe
Published in: CoRR (2023)
Keyphrases
- audio signals
- acoustic features
- audio signal
- speech recognition
- automatic speech recognition systems
- speech music discrimination
- real time
- music information retrieval
- audio features
- automatic speech recognition
- audio content
- generation process
- audio visual
- audio recordings
- facial expressions
- facial animation
- text to speech
- speech signal
- hand movements
- music composition
- multi modal