PitVQA: Image-grounded Text Embedding LLM for Visual Question Answering in Pituitary Surgery.
Runlong He, Mengya Xu, Adrito Das, Danyal Z. Khan, Sophia Bano, Hani J. Marcus, Danail Stoyanov, Matthew J. Clarkson, Mobarakol Islam
Published in: CoRR (2024)
Keyphrases
- question answering
- information retrieval
- syntactic information
- low level
- image content
- image classification
- natural language processing
- text summarization
- cross language
- question classification
- named entities
- image representation
- image features
- textual entailment recognition
- sentence retrieval
- open domain question answering
- passage retrieval
- image retrieval
- semantic roles
- qa clef
- visual features
- natural language questions
- relation extraction
- visual data
- text retrieval
- visual information
- information extraction
- probabilistic model