AVLnet: Learning Audio-Visual Language Representations from Instructional Videos.

Andrew Rouditchenko, Angie W. Boggust, David Harwath, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Rogério Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James R. Glass

Published in: CoRR (2020)

Keyphrases