Implicit and Explicit Commonsense for Multi-sentence Video Captioning.
Shih-Han Chou, James J. Little, Leonid Sigal
Published in: CoRR (2023)
Keyphrases
- multimedia
- video data
- video sequences
- video clips
- video frames
- video content
- natural language
- real time
- knowledge base
- video streams
- video images
- video analysis
- video retrieval
- online video
- real time video
- multimedia data
- video database
- explicit feedback
- video shots
- sentence level
- event recognition
- co-occurrence
- image sequences
- commonsense reasoning