Implicit and explicit commonsense for multi-sentence video captioning.

Shih-Han Chou, James J. Little, Leonid Sigal

Published in: Comput. Vis. Image Underst. (2024)

Keyphrases

multimedia
video data
video streams
video frames
real time
explicit feedback
real time video
video content
video sequences
natural language
space time
explicit or implicit
spatial and temporal
event detection
knowledge base
video analysis
video images
semantic role labeling
online video
temporal information
video database
video retrieval
video surveillance
multimedia data
computer vision