Visual Hallucination Elevates Speech Recognition.
Fang Zhang, Yongxin Zhu, Xiangxiang Wang, Huang Chen, Xing Sun, Linli Xu
Published in: AAAI (2024)
Keyphrases
- speech recognition
- hidden markov models
- language model
- speech processing
- speech recognizer
- pattern recognition
- automatic speech recognition
- speech synthesis
- speaker identification
- speech recognition technology
- speech understanding
- visual features
- speech signal
- keyword spotting
- speech retrieval
- speech recognition errors
- visual information
- isolated word
- low level
- speech recognizers
- visual data
- speech recognition systems
- machine learning
- noisy environments
- cepstral coefficients
- speaker diarization