Multimodal Speech Recognition for Language-Guided Embodied Agents.
Allen Chang, Xiaoyuan Zhu, Aarav Monga, Seoho Ahn, Tejas Srinivasan, Jesse Thomason
Published in: CoRR (2023)
Keyphrases
- speech recognition
- embodied agents
- isolated word
- hidden Markov models
- speech signal
- noisy environments
- speech synthesis
- automatic speech recognition
- language model
- speech processing
- audio-visual speech recognition
- speech recognizer
- multi-stream
- speech recognition technology
- pattern recognition
- speaker identification
- speech recognition systems
- language learning
- multi modal
- speaker independent
- virtual humans
- language processing
- machine learning
- non-stationary
- three-dimensional