From Deterministic to Generative: Multimodal Stochastic RNNs for Video Captioning.
Jingkuan SongYuyu GuoLianli GaoXuelong LiAlan HanjalicHeng Tao ShenPublished in: IEEE Trans. Neural Networks Learn. Syst. (2019)
Keyphrases
- recurrent neural networks
- multimedia
- video data
- stochastic optimization problems
- video content
- real time
- video sequences
- video streams
- stochastic methods
- multi modal
- multimodal interaction
- video analysis
- video clips
- black box
- video shots
- spatial and temporal
- video frames
- real time video
- online video
- data driven
- story segmentation
- multimodal information
- object detection
- key frames
- human activities
- video retrieval
- visual data
- unsupervised learning
- video processing
- generative model