Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning.

Yang Chen, Tian He, Junfeng Fu, Ling Wang, Jingcai Guo, Hong Cheng
Published in: CoRR (2024)