GROOViST: A Metric for Grounding Objects in Visual Storytelling.
Aditya K. Surikuchi, Sandro Pezzelle, Raquel Fernández
Published in: EMNLP (2023)
Keyphrases
- visual objects
- visual appearance
- spatial relations
- real world objects
- visual information
- data objects
- multiple objects
- visual data
- visual scene
- visual perception
- visual features
- virtual environment
- 3D objects
- data model
- visual input
- multiscale
- object description
- spatial configurations
- semantically relevant
- object models
- metric learning
- metric space
- object detection
- image retrieval
- video sequences