Robust visual question answering via semantic cross modal augmentation.
Akib Mashrur, Wei Luo, Nayyar Abbas Zaidi, Antonio Robles-Kelly
Published in: Comput. Vis. Image Underst. (2024)
Keyphrases
- question answering
- cross modal
- multi modal
- natural language
- visual similarity
- question answering systems
- natural language processing
- multimedia retrieval
- semantic roles
- information extraction
- image retrieval
- information retrieval
- semantic concepts
- named entities
- answer extraction
- visual data
- multimedia databases
- domain knowledge
- multimedia
- training set
- machine learning