Generative Video Transformer: Can Objects be the Words?

Yi-Fu Wu, Jaesik Yoon, Sungjin Ahn

Published in: CoRR (2021)

Keyphrases

semantic labels
video sequences
multimedia
video images
video data
space time
n-gram
visual data
multimedia objects
object detection and tracking
generative model
real time
image sequences
object motion
power system
spatial and temporal
video content
dynamic scenes
related words
human activities
video segments
objects in video sequences
multiple objects
data objects
key frames
video streams
fuzzy logic
d objects
expert systems
neural network