3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding.
Zeju Li
Chao Zhang
Xiaoyan Wang
Ruilong Ren
Yifan Xu
Ruifei Ma
Xiangde Liu
Published in:
CoRR (2024)
Keyphrases
multi modal
scene understanding
object detection
vision system
object recognition
scene recognition
3D scene
video surveillance
multi modality
audio visual
high dimensional
cross modal
multimedia
scene categorization
video search
image annotation
multiscale
multiple modalities
real time
uni modal