Dual-Level Decoupled Transformer for Video Captioning.
Yiqi Gao, Xinglin Hou, Wei Suo, Mengyang Sun, Tiezheng Ge, Yuning Jiang, Peng Wang
Published in: ICMR (2022)
Keyphrases
- video sequences
- video content
- video frames
- video streams
- fuzzy logic
- multimedia
- video segmentation
- video data
- real time video
- levels of abstraction
- power system
- spatio temporal
- space time
- low level
- multimedia data
- key frames
- image retrieval
- lower level
- video database
- face recognition
- artificial intelligence
- surveillance videos
- video processing
- compressed video
- neural network