ViLaS: Exploring the Effects of Vision and Language Context in Automatic Speech Recognition.
Ziyi Ni, Minglun Han, Feilong Chen, Linghui Meng, Jing Shi, Pin Lv, Bo Xu
Published in: ICASSP (2024)
Keyphrases
- automatic speech recognition
- speech recognition
- speech retrieval
- context dependent
- speech signal
- natural language
- computer vision
- hidden markov models
- context aware
- conversational speech
- machine learning
- noisy environments
- word error rate
- spoken words
- spontaneous speech
- broadcast news
- word recognition
- vision system
- pattern recognition
- information retrieval