NuScenes-QA: A Multi-Modal Visual Question Answering Benchmark for Autonomous Driving Scenario.
Tianwen Qian, Jingjing Chen, Linhai Zhuo, Yang Jiao, Yu-Gang Jiang
Published in: AAAI (2024)
Keyphrases
- multi modal
- question answering
- autonomous driving
- cross modal
- grand challenge
- video search
- single modality
- question classification
- information retrieval
- open domain question answering
- passage retrieval
- cross language
- natural language
- information extraction
- qa clef
- question answering systems
- audio visual
- natural language questions
- natural language processing
- qa systems
- image annotation
- syntactic information
- visual data
- answer extraction
- visual information
- stereo vision
- candidate answers
- visual features
- sentence retrieval
- low level
- high dimensional
- semantic roles
- speech transcripts