On the Contributions of Visual and Textual Supervision in Low-resource Semantic Speech Retrieval.
Ankita Pasad, Bowen Shi, Herman Kamper, Karen Livescu
Published in: CoRR (2019)
Keyphrases
- speech retrieval
- natural language
- high level
- semantic information
- visual information
- speech recognition
- multimedia
- conversational speech
- semantic web
- visual features
- visual content
- semantic similarity
- automatic speech recognition
- pattern recognition
- semantic analysis
- semantic concepts
- semantic knowledge
- spoken document retrieval
- low level
- active learning