VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding.
Chris Kelly, Luhui Hu, Jiayin Hu, Yu Tian, Deshun Yang, Bang Yang, Cindy Yang, Zihao Li, Zaoshan Huang, Yuexian Zou
Published in: CoRR (2024)
Keyphrases
- multi agent systems
- multi agent
- vision system
- intelligent agents
- multiagent systems
- computer vision
- image processing
- multimodal interaction
- dynamic environments
- software agents
- autonomous agents
- cooperating agents
- pedagogical agents
- real time
- mobile agents
- cooperative
- agent architecture
- audio visual
- agent model
- multiple agents
- action selection
- deeper understanding
- BDI agents
- multimodal interfaces
- neural network