MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition.
Bingshen Mu, Yangze Li, Qijie Shao, Kun Wei, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie
Published in: CoRR (2024)
Keyphrases
- speech recognition
- multi modal
- error correction
- multi granularity
- multi user
- hidden markov models
- dynamic integration
- automatic speech recognition
- language model
- pattern recognition
- location aware
- speech signal
- privacy protection
- audio visual
- video search
- speech recognition systems
- watermarking scheme
- noisy environments
- speaker diarization
- user interface