Once and for All: Self-supervised Multi-modal Co-training on One-billion Videos at Alibaba.
Lianghua Huang, Yu Liu, Xiangzeng Zhou, Ansheng You, Ming Li, Bin Wang, Yingya Zhang, Pan Pan, Yinghui Xu
Published in: ACM Multimedia (2021)
Keyphrases
- multi modal
- co training
- video search
- semi supervised learning
- semi supervised
- multi view
- unlabeled data
- text classification
- single view
- supervised learning
- labeled data
- semantic concepts
- multi modality
- video sequences
- email classification
- audio visual
- named entities
- video frames
- video content
- training examples
- cross modal
- active learning
- uni modal
- video data
- image annotation
- multimedia
- multiple views
- information retrieval
- co occurrence
- high dimensional
- training data