VD-GR: Boosting Visual Dialog with Cascaded Spatial-Temporal Multi-Modal GRaphs.
Adnen Abdessaied, Lei Shi, Andreas Bulling
Published in: WACV (2024)
Keyphrases
- multi modal
- spatial temporal
- cross modal
- single modality
- video search
- video shots
- action recognition
- spatio temporal
- spatial and temporal
- multi modality
- semantic concepts
- temporal information
- visual features
- feature selection
- image annotation
- machine learning
- visual information
- high dimensional
- human actions
- visual data
- object recognition
- image sequences