Commonsense for Zero-Shot Natural Language Video Localization.
Meghana Holla, Ismini Lourentzou
Published in: CoRR (2023)
Keyphrases
- natural language
- activity detection
- natural language descriptions
- video sequences
- multimedia
- video data
- video content
- video frames
- video streams
- video analysis
- real time video
- natural language interface
- semantic representation
- video processing
- knowledge representation
- knowledge base
- video retrieval
- natural language generation
- video database
- commonsense knowledge
- machine learning
- language processing
- video clips
- key frames
- space time
- digital video
- video segmentation
- object localization
- video images
- semantic interpretation
- temporal information
- language understanding
- localization algorithm
- natural language processing
- real time
- accurate localization
- video event detection