Multi-Modal Retrieval For Large Language Model Based Speech Recognition.
Jari Kolehmainen, Aditya Gourav, Prashanth Gurunath Shivakumar, Yile Gu, Ankur Gandhe, Ariya Rastrow, Grant P. Strimel, Ivan Bulyko
Published in: CoRR (2024)
Keyphrases
- multi modal
- speech recognition
- isolated word
- cross modal
- hidden Markov models
- video search
- language model
- speech recognizer
- information retrieval
- speech synthesis
- pattern recognition
- multi modality
- automatic speech recognition
- speech signal
- test collection
- speaker identification
- speech retrieval
- natural language
- spoken language
- relevance feedback
- image processing
- speech recognition technology
- multimedia databases
- speaker independent
- speech recognition systems
- noisy environments
- high dimensional
- uni modal
- speaker diarization
- information retrieval systems
- edge detection
- document analysis
- language processing
- neural network
- speaker adaptation
- semantic concepts
- image annotation
- multimedia data
- retrieval systems