Synthesizing Near Native-accented Speech for a Non-native Speaker by Imitating the Pronunciation and Prosody of a Native Speaker.
Raymond Chung, Brian Mak
Published in: INTERSPEECH (2022)
Keyphrases
- speech recognition
- audio-visual
- automatic speech recognition
- prosodic features
- speech synthesis
- speaker recognition
- speaker verification
- synthesized speech
- multimodal
- text-to-speech
- speech signal
- speaker identification
- speech recognizer
- spontaneous speech
- automatic speech recognition systems
- neural network
- speaker diarization
- Gaussian mixture model
- feature selection