Adapting Speech Separation to Real-World Meetings Using Mixture Invariant Training.
Aswin Sivaraman, Scott Wisdom, Hakan Erdogan, John R. Hershey
Published in: CoRR (2021)
Keyphrases
- real world
- hearing impaired
- audio visual
- speaker diarization
- wide range
- data sets
- case study
- training set
- synthetic data
- training phase
- multi modal
- training samples
- training process
- speech recognition
- training data
- neural network
- online learning
- test set
- supervised learning
- training algorithm
- natural language
- mixture model
- EM algorithm
- Gaussian distribution
- support vector machine
- feature vectors
- speech signal
- speech synthesis
- blind separation
- endpoint detection