A Multi-level Alignment Training Scheme for Video-and-Language Grounding.
Yubo Zhang, Feiyang Niu, Qing Ping, Govind Thattai
Published in: ICDM (Workshops) (2022)
Keyphrases
- programming language
- video content
- video sequences
- real time
- training process
- video data
- multimedia
- video encoder
- video analysis
- language learning
- natural language
- video frames
- multimedia data
- key frames
- space time
- training samples
- video surveillance
- video retrieval
- online learning
- word alignment
- parallel corpus
- training data