Leveraging Text Representation and Face-head Tracking for Long-form Multimodal Semantic Relation Understanding.

Published in: ACM Multimedia (2022)

Keyphrases