Publication: MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning.