In-the-wild Speech Emotion Conversion Using Disentangled Self-Supervised Representations and Neural Vocoder-based Resynthesis.
Navin Raj Prabhu, Nale Lehmann-Willenbrock, Timo Gerkmann
Published in: CoRR (2023)
Keyphrases
- emotion recognition
- emotional state
- text-to-speech synthesis
- speech recognition
- distributed representations
- network architecture
- emotional speech
- neural network
- facial expressions
- bio-inspired
- endpoint detection
- speech signal
- speaker identification
- speech synthesis
- interactive computer
- emotion classification
- recognition engine
- neural model
- neural fuzzy
- speaker diarization
- speaker recognition
- audio-visual
- higher level
- multi-modal
- pattern recognition