InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition.
Zhi-Hao Lai, Tian-Hao Zhang, Qi Liu, Xinyuan Qian, Li-Fang Wei, Feng Chen, Song-Lu Chen, Xu-Cheng Yin
Published in: INTERSPEECH (2023)
Keyphrases
- automatic speech recognition
- global features
- speech recognition
- visual features
- speech signal
- hidden Markov models
- image features
- word error rate
- broadcast news
- keypoints
- feature vectors
- speech retrieval
- conversational speech
- spoken words
- word recognition
- noisy environments
- acoustic features
- speech corpus
- classification method
- speech synthesis
- pattern recognition
- scale space
- image analysis
- image retrieval
- multiscale
- decision trees
- computer vision
- learning algorithm
- neural network