Towards Unseen Triples: Effective Text-Image-joint Learning for Scene Graph Generation.
Qianji Di, Wenxi Ma, Zhongang Qi, Tianxiang Hou, Ying Shan, Hanzi Wang
Published in: CoRR (2023)
Keyphrases
- input image
- single image
- learning algorithm
- high resolution
- multiscale
- video sequences
- image data
- image segmentation
- image retrieval
- scene images
- image content
- image representation
- scene matching
- imaging process
- previously unseen
- scene understanding
- image matching
- feature points
- image classification
- low level
- image sequences
- 3D scene
- image set
- image collections
- multiple objects
- visual data
- complex scenes
- spatial information
- reference images
- piecewise planar