Serialized Output Training for End-to-End Overlapped Speech Recognition.
Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka
Published in: INTERSPEECH (2020)
Keyphrases
- end to end
- speech recognition
- wall street journal corpus
- isolated word
- hidden markov models
- language model
- speech processing
- acoustic models
- speech recognition technology
- speech recognizer
- speech synthesis
- speech signal
- automatic speech recognition
- pattern recognition
- congestion control
- speaker independent
- noisy environments
- speaker identification
- training process
- speech recognizers
- feature vectors
- neural network
- speech retrieval
- audio visual speech recognition