Enhance Temporal Relations in Audio Captioning with Sound Event Detection.
Zeyu Xie, Xuenan Xu, Mengyue Wu, Kai Yu
Published in: INTERSPEECH (2023)
Keyphrases
- event detection
- temporal relations
- complex events
- video event
- primitive events
- soccer video
- event recognition
- temporal information
- video analysis
- temporal reasoning
- video event detection
- video surveillance
- temporal structure
- spatial relations
- multimedia
- composite events
- activity recognition
- video clips
- mid level
- scan statistic
- active database management systems
- sports video
- visual data
- video streams
- space time
- image sequences