Dual-Scale Alignment-Based Transformer on Linguistic Skeleton Tags for Non-Autoregressive Video Captioning.

Published in: ICME (2022)

Keyphrases