Multi-Modal Retrieval For Large Language Model Based Speech Recognition.
Aditya Gourav, Jari Kolehmainen, Prashanth Gurunath Shivakumar, Yile Gu, Grant P. Strimel, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko
Published in: ACL (Findings) (2024)
Keyphrases
- multi modal
- speech recognition
- isolated word
- cross modal
- hidden markov models
- video search
- automatic speech recognition
- language model
- speech signal
- multi modality
- information retrieval
- pattern recognition
- speech synthesis
- speech retrieval
- speech recognizer
- audio visual
- speech recognition technology
- speaker independent
- noisy environments
- speech recognition systems
- image retrieval
- high dimensional
- speaker identification
- test collection
- semantic concepts
- relevance feedback
- uni modal
- multimedia information retrieval
- language processing
- natural language
- neural network
- maximum likelihood