Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP.
Jinzuomu Zhong, Yang Li, Hui Huang, Jie Liu, Zhiba Su, Jing Guo, Benlai Tang, Fengjie Zhu
Published in: CoRR (2023)
Keyphrases
- multi modal
- image annotation
- automatic annotation
- manual annotation
- audio visual
- automatic image annotation
- multi modality
- metadata
- semi automatic
- fully automatic
- cross modal
- uni modal
- active learning
- image search
- semantic concepts
- high dimensional
- video search
- single modality
- smart room
- image retrieval
- auto annotation