TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation.
Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie Liu, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Sheng Zhao, Michael Zeng
Published in: CoRR (2024)
Keyphrases
- text to speech
- speech recognition
- speech synthesis
- emotion recognition
- speech recognition errors
- speech signal
- speech quality
- prosodic features
- speaker recognition
- voice activity detection
- endpoint detection
- speaker identification
- probabilistic model
- speaker verification
- broadcast news
- audio visual
- English text
- spontaneous speech
- recognition engine
- noisy environments
- automatic speech recognition
- question answering
- text to speech synthesis
- multimedia
- automatic speech recognition systems