Cross-speaker Emotion Transfer by Manipulating Speech Style Latents.
Suhee Jo, Younggun Lee, Yookyung Shin, Yeongtae Hwang, Taesu Kim
Published in: CoRR (2023)
Keyphrases
- emotion recognition
- speech recognition
- audio visual
- speaker verification
- speaker recognition
- automatic speech recognition
- speaker identification
- text to speech synthesis
- prosodic features
- emotional state
- speaker dependent
- speaker diarization
- automatic speech recognition systems
- vocal tract
- speech signal
- poor quality
- emotional speech
- text to speech
- speech synthesis
- facial expressions
- automatic transcription
- speech recognizer
- synthesized speech
- hidden Markov models
- acoustic features
- speech sounds
- audio stream
- Gaussian mixture model
- noisy environments
- crime scene
- speaker independent
- spontaneous speech
- broadcast news
- acoustic models
- emotion classification
- spoken language
- human computer interaction
- data quality
- visual speech