The multi-modal fusion in visual question answering: a review of attention mechanisms.
Siyu Lu, Mingzhe Liu, Lirong Yin, Zhengtong Yin, Xuan Liu, Wenfeng Zheng
Published in: PeerJ Comput. Sci. (2023)
Keyphrases
- question answering
- multi modal fusion
- question classification
- information retrieval
- visual information
- natural language processing
- natural language
- information extraction
- passage retrieval
- qa clef
- question answering systems
- cross language
- syntactic information
- natural language questions
- low level
- named entities
- qa systems
- sentence retrieval
- data mining