Look, Listen, and Answer: Overcoming Biases for Audio-Visual Question Answering.
Jie Ma, Min Hu, Pinghui Wang, Wangchun Sun, Lingyun Song, Hongbin Pei, Jun Liu, Youtian Du
Published in: CoRR (2024)
Keyphrases
- audio visual
- question answering
- question classification
- natural language questions
- passage retrieval
- question answering systems
- qa systems
- candidate answers
- answer validation
- multi modal
- answering questions
- answer extraction
- question answer pairs
- visual information
- natural language processing
- qa clef
- multimedia
- natural language
- named entities
- information retrieval
- visual data
- correct answers
- information extraction
- image features
- domain knowledge