Cross-Modal and Hierarchical Modeling of Video and Text.
Bowen Zhang, Hexiang Hu, Fei Sha
Published in: CoRR (2018)
Keyphrases
- cross modal
- multiple modalities
- multi modal
- visual data
- video sequences
- video data
- text mining
- semantic concepts
- multimedia
- visual recognition
- video retrieval
- text retrieval
- video streams
- video frames
- video clips
- video analysis
- image retrieval
- multimedia documents
- multimedia retrieval
- keywords
- perceptual information
- multimedia databases
- image representation
- low level
- information retrieval