RAMM: Retrieval-augmented Biomedical Visual Question Answering with Multi-modal Pre-training.
Zheng Yuan, Qiao Jin, Chuanqi Tan, Zhengyun Zhao, Hongyi Yuan, Fei Huang, Songfang Huang
Published in: CoRR (2023)
Keyphrases
- multi modal
- question answering
- cross modal
- video search
- passage retrieval
- information retrieval
- information extraction
- audio visual
- natural language processing
- sentence retrieval
- single modality
- answer extraction
- question answering systems
- named entities
- syntactic information
- natural language
- cross language
- natural language questions
- high dimensional
- training set
- speech transcripts
- text retrieval
- visual information
- document retrieval
- image retrieval
- retrieval systems
- information retrieval systems
- test set
- text categorization
- visual features
- query expansion
- machine learning
- image classification
- multiple modalities
- qa systems
- broadcast news
- test collection
- semantic concepts
- video content