A Study on Joint Modeling and Data Augmentation of Multi-Modalities for Audio-Visual Scene Classification.

Published in: ISCSLP (2022)
