Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection.
Tim Salzmann, Markus Ryll, Alex Bewley, Matthias Minderer
Published in: CoRR (2024)
Keyphrases
- end to end
- text localization and recognition
- congestion control
- real world
- multipath
- wireless ad hoc networks
- admission control
- observed scene
- high bandwidth
- ad hoc networks
- object detection
- real time
- visual information
- 3D scene
- visual features
- content delivery
- application layer
- rate allocation
- scalable video
- internet protocol
- rate adaptation