VLM2Scene: Self-Supervised Image-Text-LiDAR Learning with Foundation Models for Autonomous Driving Scene Understanding.
Guibiao Liao, Jiankun Li, Xiaoqing Ye
Published in: AAAI (2024)
Keyphrases
- scene understanding
- 3D scene
- vision system
- object recognition
- object detection
- scene categorization
- scene recognition
- single image
- scene labeling
- video surveillance
- input image
- image features
- indoor scenes
- scene interpretation
- image representation
- object detectors
- real time
- computer vision
- image sequences
- image classification
- reinforcement learning
- image retrieval
- probabilistic model
- grand challenge
- image regions
- object hypotheses
- autonomous driving
- input data