Attend What You Need: Motion-Appearance Synergistic Networks for Video Question Answering.
Ahjeong Seo, Gi-Cheon Kang, Joonhan Park, Byoung-Tak Zhang
Published in: CoRR (2021)
Keyphrases
- question answering
- object motion
- dynamic textures
- space time
- key frames
- information extraction
- natural language processing
- information retrieval
- image sequences
- multimedia
- qa clef
- cross language
- question classification
- video sequences
- named entities
- video content
- natural language
- passage retrieval
- visual data
- semantic roles
- open domain question answering
- qa systems
- question answering systems
- natural language questions
- video frames
- video data
- relation extraction
- video retrieval
- syntactic information
- human motion
- answer extraction
- video search
- human actions
- answer validation
- image classification