HVM-1: Large-scale video models pretrained with nearly 5000 hours of human-like video data.
A. Emin Orhan
Published in: CoRR (2024)
Keyphrases
- video data
- video content
- video streams
- video analysis
- video frames
- video sequences
- digital video
- video camera
- surveillance videos
- video editing
- video database
- video browsing
- video indexing
- key frames
- video clips
- multimedia
- video retrieval
- video shots
- visual data
- video annotation
- motion trajectories
- shot boundaries
- multimedia systems
- video abstraction
- video footage
- temporal structure
- surveillance cameras
- video recordings
- computer vision
- video dataset
- three dimensional
- motion features
- content based video retrieval
- content based indexing
- object tracking