Publication: Video-Grounded Dialogues with Joint Video and Image Training.