TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation.

Tanzila Rahman, Mengyu Yang, Leonid Sigal

Published in: CoRR (2021)

Keyphrases

visual representation
learning process
visual information
visual representations
human centric
user interface
context aware
ambient intelligence
transfer learning
computing environments
sound source