Distributed Audio-Visual Parsing Based On Multimodal Transformer and Deep Joint Source Channel Coding.

Published in: ICASSP (2022)

Keyphrases