Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images.
Ramin Nakhli, Puria Azadi Moghadam, Haoyang Mi, Hossein Farahani, Alexander Baras, C. Blake Gilks, Ali Bashashati
Published in: CVPR (2023)
Keyphrases
- multi modal
- image annotation
- fusing multiple
- input image
- high dimensional
- multiple modalities
- image data
- multi modality
- image analysis
- auto annotation
- image features
- image collections
- audio visual
- image retrieval
- pixel values
- image registration
- segmentation method
- segmentation algorithm
- visual recognition
- multimedia
- object recognition
- multiscale