Implicit and explicit commonsense for multi-sentence video captioning.
Shih-Han Chou, James J. Little, Leonid Sigal
Published in: Comput. Vis. Image Underst. (2024)
Keyphrases
- multimedia
- video data
- video streams
- video frames
- real time
- explicit feedback
- real time video
- video content
- video sequences
- natural language
- space time
- explicit or implicit
- spatial and temporal
- event detection
- knowledge base
- video analysis
- video images
- semantic role labeling
- online video
- temporal information
- video database
- video retrieval
- video surveillance
- multimedia data
- computer vision