Keyword localisation in untranscribed speech using visually grounded speech models.
Kayode Olaleye, Dan Oneata, Herman Kamper
Published in: CoRR (2022)
Keyphrases
- speech recognition
- speech signal
- spoken language
- probabilistic model
- speech synthesis
- audio visual
- automatic speech recognition
- automatic speech recognition systems
- bayesian networks
- text to speech synthesis
- recognition engine
- multimodal interfaces
- text to speech
- spoken dialogue systems
- broadcast news
- classification models
- statistical models
- statistical model
- complex systems
- model selection
- keywords