Tri-Modal Dense Video Captioning Based on Fine-Grained Aligned Text and Anchor-Free Event Proposals Generator.
Jingjing Niu, Yulai Xie, Yang Zhang, Jinyu Zhang, Yanfei Zhang, Xiao Lei, Fang Ren
Published in: Int. J. Pattern Recognit. Artif. Intell. (2022)
Keyphrases
- fine grained
- video segments
- coarse grained
- event recognition
- event detection
- video event
- video data
- video content
- video search
- video sequences
- news stories
- video retrieval
- information retrieval
- video clips
- video analysis
- news video
- tightly coupled
- video frames
- access control
- soccer video
- text mining
- modal logic
- multimedia
- video streams
- database
- free text
- key frames
- keywords
- text processing
- sentence level
- video shots
- knowledge base
- text classification
- data lineage