Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks.

Xian Li, Nian Shao, Xiaofei Li
Published in: CoRR (2023)
Keyphrases
  • multimedia
  • learning algorithm
  • video data