Fusion of Multi-Modal Features to Enhance Dense Video Caption.
Xuefei Huang, Ka-Hou Chan, Weifan Wu, Hao Sheng, Wei Ke
Published in: Sensors (2023)
Keyphrases
- multi modal
- multi modality
- multiple modalities
- video search
- semantic concepts
- audio visual
- fusing multiple
- low level
- feature extraction
- feature space
- feature set
- single modality
- high dimensional
- key frames
- cross modal
- uni modal
- auto annotation
- video clips
- image annotation
- image features
- feature vectors
- video sequences
- visual cues
- video retrieval
- video streams
- video data
- image processing