CIF-RNNT: Streaming ASR Via Acoustic Word Embeddings with Continuous Integrate-and-Fire and RNN-Transducers.
Wen Shen Teo, Yasuhiro Minami
Published in: ICASSP (2024)
Keyphrases
- automatic speech recognition
- speech recognizers
- speech recognition
- speech recognition systems
- recurrent neural networks
- spontaneous speech
- prosodic features
- speech sounds
- continuous query processing
- nearest neighbor
- co-occurrence
- word recognition
- data streams
- recognition errors
- real time
- finite automata
- information retrieval
- word error rate
- spoken language
- acoustic features
- streaming data
- dimensionality reduction
- speech recognizer
- n-gram
- video streaming
- vector space
- word level
- spoken document retrieval
- continuous functions
- source localization
- distance measure
- binary codes
- conversational speech
- speech signal
- acoustic signal
- keywords
- multimedia