Decoder-only Architecture for Streaming End-to-end Speech Recognition.

Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe

Published in: CoRR (2024)

Keyphrases

end to end
speech recognition
scalable video
rate allocation
language model
hidden markov models
rate adaptation
automatic speech recognition
real time
pattern recognition
speech signal
cross layer
speech synthesis
speech recognizer
noisy environments
video streaming
speech recognition systems
congestion control
data streams
transport layer
speaker identification
low complexity
speaker independent
speaker adaptation
speech recognition technology
application layer
multiple description coding
motion estimation
feature extraction