Visual Oriented Encoder: Integrating Multimodal and Multi-Scale Contexts for Video Captioning.
Bang Yang, Yuexian Zou
Published in: ICPR (2020)
Keyphrases
- multiscale
- multimodal information
- video encoder
- visual data
- multimedia
- visual cues
- video data
- audio visual
- visual analysis
- multi modal
- video search
- real time video
- visual information
- video streams
- real time
- story segmentation
- video frames
- video encoding
- multimodal interaction
- video compression
- edge detection
- news video
- video sequences
- multiple modalities
- video content
- low complexity
- visual features
- visual speech
- low level
- natural images
- key frames
- scale space
- visual concepts
- temporal correlation
- mpeg standard
- image segmentation
- video retrieval
- error control
- rate distortion
- bit rate
- compressed video
- cross modal
- image classification
- video database