Publication: Video captioning based on vision transformer and reinforcement learning.