Fusing Temporally Distributed Multi-Modal Semantic Clues for Video Question Answering.
Fuwei Zhang, Ruomei Wang, Songhua Xu, Fan Zhou
Published in: ICME (2021)
Keyphrases
- multi-modal
- question answering
- semantic concepts
- natural language
- semantic parsing
- video search
- semantic roles
- natural language processing
- question answering systems
- information retrieval
- passage retrieval
- audio visual
- information extraction
- question classification
- cross language
- answer extraction
- syntactic information
- QA@CLEF
- multimedia
- video shots
- video sequences
- image annotation
- multiple modalities
- visual features
- natural language questions
- video frames
- video data
- high level
- video analysis