MMGER: Multi-Modal and Multi-Granularity Generative Error Correction With LLM for Joint Accent and Speech Recognition.
Bingshen Mu, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie
Published in: IEEE Signal Process. Lett. (2024)
Keyphrases
- speech recognition
- multi modal
- error correction
- multi granularity
- multi user
- dynamic integration
- hidden markov models
- automatic speech recognition
- language model
- high dimensional
- pattern recognition
- noisy environments
- location aware
- privacy protection
- audio visual
- watermarking scheme
- discriminative training
- speech signal
- speaker diarization
- speech recognition systems
- speaker identification
- image processing
- video search
- context aware
- multiscale
- neural network