Audio-Visual Scene-Aware Dialog and Reasoning Using Audio-Visual Transformers with Joint Student-Teacher Learning.

Ankit P. Shah, Shijie Geng, Peng Gao, Anoop Cherian, Takaaki Hori, Tim K. Marks, Jonathan Le Roux, Chiori Hori
Published in: ICASSP (2022)
Keyphrases
  • audio visual
  • visual data
  • multi modal
  • visual information
  • multi stream
  • multimedia
  • learning process
  • online learning
  • active learning
  • machine learning
  • high level
  • image sequences
  • student teachers