A deep co-attentive hand-based video question answering framework using multi-view skeleton.
Razieh Rastgoo, Kourosh Kiani, Sergio Escalera
Published in: Multim. Tools Appl. (2023)
Keyphrases
- multi view
- question answering
- single view
- multi view learning
- multiple views
- depth map
- natural language
- free viewpoint
- 3d objects
- information extraction
- natural language processing
- view synthesis
- three dimensional
- named entities
- information retrieval
- semi supervised
- multimedia
- depth video
- question answering systems
- supervised learning
- training set
- visual data
- video content
- semi supervised learning
- probabilistic model