Fast-Slow Transformer for Visually Grounding Speech.
Puyuan Peng, David Harwath
Published in: ICASSP (2022)
Keyphrases
- speech recognition
- speech signal
- speech synthesis
- fault diagnosis
- audio-visual
- fuzzy logic
- text-to-speech
- spoken language
- automatic speech recognition
- speaker recognition
- language acquisition
- spontaneous speech
- vocal tract
- speech processing
- multilingual
- broadcast news
- artificial intelligence
- case study
- spoken dialogue systems
- dialogue system
- audio signals
- power system
- human-computer interaction
- speech recognizer
- pattern recognition