Serialized Output Training for End-to-End Overlapped Speech Recognition

Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka

Published in: INTERSPEECH (2020)

Keyphrases

end to end
speech recognition
wall street journal corpus
isolated word
hidden markov models
language model
speech processing
acoustic models
speech recognition technology
speech recognizer
speech synthesis
speech signal
automatic speech recognition
pattern recognition
congestion control
speaker independent
noisy environments
speaker identification
training process
speech recognizers
feature vectors
neural network
speech retrieval
audio visual speech recognition