VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing.
Philip Anastassiou, Zhenyu Tang, Kainan Peng, Dongya Jia, Jiaxin Li, Ming Tu, Yuping Wang, Yuxuan Wang, Mingbo Ma
Published in: CoRR (2024)
Keyphrases
- speech recognition
- text to speech
- main contribution
- emotion recognition
- speech signal
- automatic speech recognition
- recognition engine
- speech synthesis
- audio visual
- unified model
- spoken language
- speaker recognition
- automatic speech recognition systems
- speech quality
- speech recognition errors
- endpoint detection
- fundamental frequency
- image processing
- vocal tract
- conceptual framework
- lightweight
- case study
- multimedia