Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0.
Marie Kunesová, Zbynek Zajíc
Published in: CoRR (2022)
Keyphrases
- multi task
- speech recognition
- voice activity detection
- synthesized speech
- prosodic features
- speech sounds
- audio visual
- text to speech
- speaker verification
- multitask learning
- speaker recognition
- speech synthesis
- speaker identification
- automatic speech recognition
- noisy environments
- multi task learning
- learning tasks
- speech signal
- mel frequency cepstral coefficients
- multi class
- speaker dependent
- transfer learning
- gaussian processes
- machine learning
- feature selection
- prior knowledge
- speech quality
- hidden markov models
- emotion recognition
- speaker diarization
- acoustic features
- feature set
- learning problems
- multiple tasks
- data sets