Zero Shot Audio to Audio Emotion Transfer With Speaker Disentanglement.
Soumya Dutta, Sriram Ganapathy
Published in: CoRR (2024)
Keyphrases
- automatic speech recognition
- acoustic features
- speech recognition
- speaker identification
- speech signal
- hidden markov models
- emotion recognition
- audio visual
- multimedia
- visual information
- audio signals
- multimodal fusion
- audio video
- signal processing
- audio stream
- speaker verification
- facial expressions
- pattern recognition
- machine learning
- digital video
- gaussian mixture model
- text classification
- neural network