Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering.
Xingrui Wang, Wufei Ma, Angtian Wang, Shuo Chen, Adam Kortylewski, Alan L. Yuille
Published in: CoRR (2024)
Keyphrases
- dynamic scenes
- question answering
- space time
- video sequences
- multi view
- moving objects
- static scenes
- information extraction
- motion segmentation
- multiple views
- natural language
- natural language processing
- question classification
- information retrieval
- passage retrieval
- frame rate
- background subtraction
- answering questions
- natural language questions
- video frames
- question answering systems
- artificial intelligence
- syntactic information
- computer vision
- qa systems
- video data
- answer extraction
- high speed