An End-to-End Visual-Audio Attention Network for Emotion Recognition in User-Generated Videos.
Sicheng Zhao, Yunsheng Ma, Yang Gu, Jufeng Yang, Tengfei Xing, Pengfei Xu, Runbo Hu, Hua Chai, Kurt Keutzer
Published in: CoRR (2020)
Keyphrases
- end-to-end
- user-generated
- emotion recognition
- sentiment analysis
- audio-visual
- visual information
- social media
- human-computer interaction
- facial expressions
- web content
- video sharing
- visual data
- visual features
- text classification
- information fusion
- user interests
- video sequences
- text mining
- social networks
- natural language processing
- facial images
- low-level
- emotional state
- website
- search engine