Multimodal Word Discovery and Retrieval With Spoken Descriptions and Visual Concepts.
Liming Wang, Mark Hasegawa-Johnson
Published in: IEEE ACM Trans. Audio Speech Lang. Process. (2020)
Keyphrases
- visual concepts
- learning tasks
- video content
- semantic concepts
- visual content
- image collections
- image content
- object categories
- keywords
- visual information
- multi-modal
- high level
- image annotation
- visual features
- co-occurrence
- information retrieval
- image retrieval
- semantic gap
- multimedia
- video retrieval
- learning algorithm
- image patches
- information retrieval systems
- image database
- image understanding
- multimedia databases
- object classes
- multi-label