LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning.
Sijin Chen, Xin Chen, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, Tao Chen
Published in: CoRR (2023)
Keyphrases
- cognitive processing
- visual information
- model based reasoning
- visual representations
- mixed initiative
- incomplete knowledge
- visual analysis
- solving problems
- multimedia
- visual processing
- plan execution
- reasoning process
- virtual reality
- technical systems
- user interaction
- visual features
- decision support
- HTN planning
- forward chaining
- cognitive skills
- omni directional
- visual data mining
- knowledge base
- temporal planning
- cognitive abilities
- description logics
- plan generation
- cognitive processes
- computer graphics
- heuristic search
- AI planning
- planning problems
- cognitive architecture