Automatic Speech Disentanglement for Voice Conversion using Rank Module and Speech Augmentation.
Zhonghua Liu, Shijun Wang, Ning Chen
Published in: CoRR (2023)
Keyphrases
- text-to-speech
- speech recognition
- speech signal
- speech synthesis
- emotion recognition
- speech quality
- audio-visual
- fully automatic
- voice activity detection
- prosodic features
- recognition engine
- speech music discrimination
- speech recognition errors
- dialogue system
- broadcast news
- speaker recognition
- speaker identification
- speech sounds
- speech processing
- multilingual
- spoken language
- automatic speech recognition
- data sets
- fundamental frequency
- gaussian mixture model
- semi-automatic
- multimodal
- information retrieval