Learning explicit video attributes from mid-level representation for video captioning.

Published in: Comput. Vis. Image Underst. (2017)

Keyphrases