Dual-Level Decoupled Transformer for Video Captioning.

Yiqi Gao, Xinglin Hou, Wei Suo, Mengyang Sun, Tiezheng Ge, Yuning Jiang, Peng Wang

Published in: ICMR (2022)

Keyphrases

video sequences
video content
video frames
video streams
fuzzy logic
multimedia
video segmentation
video data
real time video
levels of abstraction
power system
spatio temporal
space time
low level
multimedia data
key frames
image retrieval
lower level
video database
face recognition
artificial intelligence
surveillance videos
video processing
compressed video
neural network