VLM2Scene: Self-Supervised Image-Text-LiDAR Learning with Foundation Models for Autonomous Driving Scene Understanding.

Guibiao Liao, Jiankun Li, Xiaoqing Ye

Published in: AAAI (2024)

Keyphrases

scene understanding
3D scene
vision system
object recognition
object detection
scene categorization
scene recognition
single image
scene labeling
video surveillance
input image
image features
indoor scenes
scene interpretation
image representation
object detectors
real time
computer vision
image sequences
image classification
reinforcement learning
image retrieval
probabilistic model
grand challenge
image regions
object hypotheses
autonomous driving
input data