Multi-modal Dependency Tree for Video Captioning.
Wentian Zhao, Xinxiao Wu, Jiebo Luo
Published in: NeurIPS (2021)
Keyphrases
- multi modal
- dependency tree
- semantic concepts
- video search
- relation extraction
- dependency parsing
- video data
- multi modality
- video sequences
- video content
- audio visual
- multiple modalities
- multimedia
- video frames
- syntactic structures
- information retrieval
- key frames
- sentiment classification
- automatic extraction
- question answering
- keywords
- feature extraction