Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation.
Hongtao Wu, Ya Jing, Chilam Cheang, Guangzeng Chen, Jiafeng Xu, Xinghang Li, Minghuan Liu, Hang Li, Tao Kong
Published in: ICLR (2024)
Keyphrases
- visual cues
- video sequences
- visual input
- visual data
- mobile robot
- manipulation tasks
- visual analysis
- video data
- real time
- low level
- video streams
- classifier training
- visual information
- robot navigation
- training set
- multimedia
- content based video retrieval
- human robot interaction
- landmark recognition
- visual landmarks
- training process
- space time
- visual concepts
- high level
- vision system
- visual features
- video search
- video content
- humanoid robot
- training examples
- video clips
- supervised learning
- motor learning
- service robots
- key frames
- event recognition
- path planning
- autonomous robots
- event detection
- video database
- video analysis
- visual attention
- video frames
- discriminative training
- multi robot
- generative model
- motor skills
- robotic systems