Abstractive Multi-Video Captioning: Benchmark Dataset Construction and Extensive Evaluation.
Rikito Takahashi, Hirokazu Kiyomaru, Chenhui Chu, Sadao Kurohashi
Published in: LREC/COLING (2024)
Keyphrases
- benchmark datasets
- video frames
- real time
- video data
- video content
- video sequences
- multimedia
- neural network
- video search
- video analysis
- pedestrian detection
- human activities
- real time video
- space time
- video segmentation
- audio video
- video streams
- video images
- surveillance videos
- digital video
- video shots
- video database
- multi modal
- video clips
- spatial and temporal
- human body
- video processing
- web search
- event detection
- online video