Self-Supervised Audio-Visual Representation Learning for in-the-wild Videos.

Zishun Feng, Ming Tu, Rui Xia, Yuxuan Wang, Ashok K. Krishnamurthy
Published in: IEEE BigData (2020)
Keyphrases
  • audio visual
  • multimedia
  • databases
  • video sequences
  • data analysis
  • data management
  • text classification
  • multi modal
  • semantic information
  • human actions
  • video summarization
  • temporal context