SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM.
Jielin Qiu, Andrea Madotto, Zhaojiang Lin, Paul A. Crook, Yifan Ethan Xu, Xin Luna Dong, Christos Faloutsos, Lei Li, Babak Damavandi, Seungwhan Moon
Published in: CoRR (2024)
Keyphrases
- question answering
- passage retrieval
- named entities
- information retrieval
- document retrieval
- sentence retrieval
- answer extraction
- information extraction
- question classification
- natural language processing
- retrieval model
- question answering systems
- audio visual
- qa clef
- visual information
- natural language questions
- cross language
- multi modal
- open domain question answering
- low level
- test collection
- natural language
- visual features
- query expansion
- syntactic information
- relation extraction
- speech transcripts
- knowledge base
- information retrieval systems
- relevance ranking
- text mining
- candidate answers
- relevance feedback
- image retrieval
- multimedia