Commonsense for Zero-Shot Natural Language Video Localization.
Meghana Holla, Ismini Lourentzou
Published in: AAAI (2024)
Keyphrases
- natural language
- activity detection
- video data
- video content
- video sequences
- natural language descriptions
- natural language interface
- multimedia
- video streams
- real time
- real time video
- knowledge base
- video clips
- video database
- machine learning
- key frames
- video processing
- natural language processing
- commonsense knowledge
- event recognition
- multimedia data
- semantic analysis
- face detection
- video analysis
- video retrieval
- space time
- spatial and temporal
- question answering
- language processing
- natural language understanding
- object categories
- semantic interpretation
- neural network
- multi view
- online video
- data sets