A focus fusion attention mechanism integrated with image captions for knowledge graph-based visual question answering.
Mingyang Ma, Turdi Tohti, Yi Liang, Zicheng Zuo, Askar Hamdulla
Published in: Signal Image Video Process. (2024)
Keyphrases
- question answering
- attention mechanism
- visual features
- low level
- image data
- input image
- image classification
- image features
- multiscale
- image content
- information retrieval
- question classification
- knowledge base
- syntactic information
- visual attention
- image regions
- image retrieval
- natural language processing
- information extraction
- domain knowledge
- image representation
- knowledge representation
- qa clef
- visual information
- cross language
- passage retrieval
- expert systems
- higher level
- saliency map
- natural language
- natural language questions
- similarity measure