Multi-Speaker Video Dialog with Frame-Level Temporal Localization.
Qiang Wang, Pin Jiang, Zhiyi Guo, Yahong Han, Zhou Zhao
Published in: AAAI (2020)
Keyphrases
- temporal coherence
- video frames
- temporal correlation
- temporal information
- spatial and temporal
- temporal redundancy
- adjacent frames
- temporal continuity
- activity detection
- key frames
- spatio temporally
- spatio temporal
- video sequences
- temporal structure
- video content
- temporal consistency
- video data
- space time
- image frames
- temporal domain
- video coding
- dynamic textures
- temporal segmentation
- temporal analysis
- input video
- spatial temporal
- video streams
- real time
- conversational agents
- temporal order
- neighboring frames
- multimedia
- temporal resolution
- single frame
- video analysis
- temporal constraints
- event detection
- speaker verification
- speaker recognition
- temporal relations
- motion trajectories
- audio visual
- inter frame
- video clips