Multi-modal 3D Human Tracking for Robots in Complex Environment with Siamese Point-Video Transformer.
Shuo Xin, Zhen Zhang, Mengmeng Wang, Xiaojun Hou, Yaowei Guo, Xiao Kang, Liang Liu, Yong Liu
Published in: ICRA (2024)
Keyphrases
- multi modal
- complex environments
- human tracking
- video search
- human detection
- humanoid robot
- monocular images
- video sequences
- appearance model
- multimedia
- autonomous agents
- dynamical model
- video data
- video frames
- multiple modalities
- mobile robot
- video surveillance
- space time
- multimedia data
- high dimensional
- spatial and temporal
- machine learning
- action recognition
- event detection
- medical images
- image sequences