USTED: Improving ASR with a Unified Speech and Text Encoder-Decoder.
Bolaji Yusuf, Ankur Gandhe, Alex Sokolov
Published in: ICASSP (2022)
Keyphrases
- automatic speech recognition
- spontaneous speech
- speech recognition
- low complexity
- video codec
- text to speech synthesis
- speech signal
- distributed video coding
- conversational speech
- decoding process
- text to speech
- spoken words
- error control
- text recognition
- human machine interaction
- text input
- english text
- noisy channel
- motion estimation
- wyner ziv video coding
- bit rate
- mpeg avc
- rate distortion
- wyner ziv
- successive approximation
- spoken language
- video coding
- temporal correlation
- hidden markov models
- lexical features
- distributed source coding
- word error rate
- information retrieval
- video coder
- noisy environments
- video coding scheme
- speech synthesis
- broadcast news
- speech retrieval
- video compression
- motion compensated prediction
- spoken document retrieval
- rate allocation
- motion compensation
- low bit rate
- video transcoding
- video sequences