GPT2MVS: Generative Pre-trained Transformer-2 for Multi-modal Video Summarization.
Jia-Hong Huang
Luka Murn
Marta Mrak
Marcel Worring
Published in:
CoRR (2021)
Keyphrases
multi-modal
video summarization
pre-trained
audio-visual
training data
training examples
generative model
motion vectors
key frames
video content
video data
event detection
high-dimensional
video sequences
video frames
small number
supervised learning
image retrieval
data sets
face recognition
feature selection