Learning CLIP Guided Visual-Text Fusion Transformer for Video-based Pedestrian Attribute Recognition.
Jun Zhu, Jiandong Jin, Zihan Yang, Xiaohao Wu, Xiao Wang
Published in: CVPR Workshops (2023)
Keyphrases
- visual learning
- learning process
- learning systems
- visual recognition
- object recognition
- online learning
- recognition accuracy
- object detection
- document analysis
- unsupervised learning
- active learning
- learning tasks
- reinforcement learning
- learning algorithm
- information retrieval
- hidden markov models
- machine learning
- visual features