Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image.
Pengkun Jiao, Na Zhao, Jingjing Chen, Yu-Gang Jiang
Published in: CoRR (2024)
Keyphrases
- 3D objects
- image contours
- physically plausible
- web images
- image data
- keywords
- object recognition
- textual descriptions
- single image
- real world objects
- image segmentation
- image representation
- multi view
- image classification
- input image
- image features
- multi views
- multiple views
- image content
- object features
- object matching
- three dimensional objects
- three dimensional
- 3D mesh
- viewpoint
- range data
- object detection
- surface points
- pose estimation
- keypoints
- object representation
- object classes
- image retrieval
- surface properties
- lighting conditions
- shape descriptors
- line drawings
- surface patches
- hough transform
- perceptual grouping
- feature points
- complex background
- urban environments
- object detectors
- cad models
- geometric invariants
- pose normalization