Generative Video Transformer: Can Objects be the Words?

Yi-Fu Wu, Jaesik Yoon, Sungjin Ahn

Published in: ICML (2021)

Keyphrases

video images
multimedia objects
multimedia
visual data
semantic labels
video sequences
generative model
video segments
video frames
video streams
video data
video clips
spatial and temporal
fuzzy logic
multimedia data
related words
objects in video sequences
multiple objects
real-time
3D objects
neural network
video analysis
video content
data objects
space-time
spatio-temporal
object detection and tracking