Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation.
Wangbo Zhao, Kai Wang, Xiangxiang Chu, Fuzhao Xue, Xinchao Wang, Yang You
Published in: CoRR (2022)
Keyphrases
- multi modal
- video segmentation
- motion cues
- video sequences
- humanoid robot
- audio visual
- image sequences
- semantic information
- image annotation
- multi modality
- video analysis
- image search
- video frames
- high dimensional
- shot boundary detection
- feature vectors
- semantic concepts
- motion features
- feature set
- image features
- low level
- video search
- cross modal
- single modality
- spatial information
- image classification
- visual features
- video retrieval
- multimedia
- key frames