BARTPhoBEiT: Pre-trained Sequence-to-Sequence and Image Transformers Models for Vietnamese Visual Question Answering.
Khiem Vinh Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Published in: CoRR (2023)
Keyphrases
- question answering
- pre-trained
- natural language processing
- image features
- statistical model
- input image
- information retrieval
- information extraction
- single image
- image representation
- passage retrieval
- low level
- image retrieval
- training data
- image content
- multi-modal
- probabilistic model
- named entities
- document retrieval
- question answering systems
- qa clef