Realizing Visual Question Answering for Education: GPT-4V as a Multimodal AI.
Gyeong-Geon Lee, Xiaoming Zhai
Published in: CoRR (2024)
Keyphrases
- question answering
- artificial intelligence
- information retrieval
- natural language processing
- information extraction
- syntactic information
- question classification
- natural language
- named entities
- knowledge representation
- open domain question answering
- cross language
- visual information
- machine learning
- visual features
- passage retrieval
- question answering systems
- natural language questions
- qa clef
- expert systems
- qa systems
- answer extraction
- low level
- multi modal
- semantic roles
- relation extraction
- audio visual
- multimedia
- sql queries
- speech transcripts