HumanVLA: Towards Vision-Language Directed Object Rearrangement by Physical Humanoid.
Xinyu Xu, Yizheng Zhang, Yong-Lu Li, Lei Han, Cewu Lu
Published in: CoRR (2024)
Keyphrases
- computer vision
- 3D objects
- programming language
- vision system
- real time
- language learning
- data objects
- physical objects
- humanoid robot
- high level
- visual perception
- object identity
- physical world
- natural language
- complex scenes
- target object
- object segmentation
- specification language
- real world objects
- spatial relations
- visual input