Bridging the Gap: Using Deep Acoustic Representations to Learn Grounded Language from Percepts and Raw Speech.
Gaoussou Youssouf Kebe, Luke E. Richards, Edward Raff, Francis Ferraro, Cynthia Matuszek
Published in: AAAI (2022)
Keyphrases
- vowel phonemes
- spoken language
- text to speech
- english text
- language acquisition
- text to speech synthesis
- speech sounds
- human language
- speech recognition
- natural language
- prosodic features
- programming language
- automatic speech recognition
- language generation
- speech recognizer
- speech recognition systems
- language processing
- language learning
- human communication
- speech signal
- acoustic features
- acoustic signal
- emotional speech
- spoken dialog systems
- high level
- source localization
- speech recognizers
- speaker recognition
- natural language processing