Self-Supervised Learning for Audio-Visual Relationships of Videos With Stereo Sounds.

Tomoya Sato, Yusuke Sugano, Yoichi Sato
Published in: IEEE Access (2022)
Keyphrases
  • audio visual
  • computer vision
  • metadata
  • three dimensional
  • multi modal
  • multimedia
  • video content
  • audio features
  • temporal context