Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection.

Tim Salzmann, Markus Ryll, Alex Bewley, Matthias Minderer

Published in: CoRR (2024)

Keyphrases

end to end
text localization and recognition
congestion control
real world
multipath
wireless ad hoc networks
admission control
observed scene
high bandwidth
ad hoc networks
object detection
real time
visual information
3d scene
visual features
content delivery
application layer
rate allocation
scalable video
internet protocol
rate adaptation