A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding.
Yingzhi Wang, Abdelmoumene Boumadane, Abdelwahab Heba
Published in: CoRR (2021)
Keyphrases
- language understanding
- speaker verification
- fine tuned
- speech emotion recognition
- natural language understanding
- fine tuning
- noisy environments
- dialogue management
- language processing
- audio visual
- semantic interpretation
- domain specific
- dialogue system
- spoken dialogue systems
- language identification
- emotion recognition
- multilayer perceptron
- natural language
- cognitive psychology
- general knowledge
- knowledge representation
- using artificial neural networks
- multi modal
- neural network
- hidden Markov models
- semantic analysis
- learning algorithm