Visual Question Answering Instruction: Unlocking Multimodal Large Language Model To Domain-Specific Visual Multitasks.
Jusung Lee, Sungguk Cha, Younghyun Lee, Cheoljong Yang
Published in: CoRR (2024)
Keyphrases
- question answering
- language model
- domain specific
- information retrieval
- passage retrieval
- n-gram
- sentence retrieval
- visual information
- document retrieval
- language modeling
- probabilistic model
- natural language processing
- query expansion
- relation extraction
- information extraction
- visual features
- question classification
- speech recognition