CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations.
Leying Zhang, Yao Qian, Long Zhou, Shujie Liu, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, Sheng Zhao, Michael Zeng
Published in: CoRR (2024)
Keyphrases
- speech recognition
- conversational speech
- human communication
- human behavior
- recognition engine
- speech signal
- automatic speech recognition
- generation process
- noisy environments
- automatic speech recognition systems
- spontaneous speech
- vocal tract
- speaker identification
- spoken language
- real time
- object recognition
- multi agent
- learning algorithm