BARTPhoBEiT: Pre-trained Sequence-to-Sequence and Image Transformers Models for Vietnamese Visual Question Answering.
Khiem Vinh Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Published in: MAPR (2023)
Keyphrases
- question answering
- pre-trained
- input image
- single image
- natural language processing
- image content
- named entities
- image retrieval
- low level
- probabilistic model
- image features
- data sets
- image classification
- information retrieval
- statistical model
- feature selection
- computer vision
- artificial intelligence
- learning algorithm
- image set
- illumination conditions
- cross language