Serialized Output Training for End-to-End Overlapped Speech Recognition.
Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka
Published in: CoRR (2020)
Keyphrases
- end to end
- speech recognition
- wall street journal corpus
- isolated word
- hidden markov models
- language model
- automatic speech recognition
- speech recognizer
- speech processing
- congestion control
- acoustic models
- speech signal
- speech synthesis
- speech recognition technology
- speech recognition systems
- speaker identification
- pattern recognition
- discriminative training
- noisy environments
- speech recognizers
- multimedia
- speaker diarization
- training process
- speech retrieval
- probabilistic model
- audio visual speech recognition
- face recognition