A Transformer-based Cross-modal Fusion Model with Adversarial Training for VQA Challenge 2021.
Ke-Han Lu, Bo-Han Fang, Kuan-Yu Chen
Published in: CoRR (2021)
Keyphrases
- cross modal
- fusion model
- multi modal
- information fusion
- training set
- image retrieval
- image database
- fuzzy logic
- visual data
- visual recognition
- multimedia retrieval
- image registration
- fusion framework
- multiscale
- multimedia
- estimation algorithm
- feature extraction
- databases
- database systems
- multimedia databases
- metadata
- knowledge base