Speech ReaLLM - Real-time Streaming Speech Recognition with Multimodal LLMs by Teaching the Flow of Time.
Frank Seide, Morrie Doulaty, Yangyang Shi, Yashesh Gaur, Junteng Jia, Chunyang Wu
Published in: CoRR (2024)
Keyphrases
- speech recognition
- real time streaming
- speech synthesis
- speech signal
- automatic speech recognition
- speech recognizer
- hidden markov models
- speech processing
- language model
- high speed networks
- pattern recognition
- multipath
- speech recognition systems
- speech recognition technology
- speaker identification
- speech recognition errors
- speaker independent
- keyword spotting
- noisy environments
- audio visual
- speaker dependent
- isolated word
- recognition engine
- multi modal
- speech recognizers
- e learning
- word error rate
- speech retrieval
- multimedia
- multi stream
- speaker adaptation
- cepstral coefficients
- computer assisted instruction
- database systems
- noisy speech