The MSR-Video to Text dataset with clean annotations.
Haoran Chen, Jianmin Li, Simone Frintrop, Xiaolin Hu
Published in: Comput. Vis. Image Underst. (2022)
Keyphrases
- natural language descriptions
- action recognition
- video dataset
- human actions
- semantic labels
- text detection
- video sequences
- keywords
- video content
- video search
- video data
- text information
- natural language
- news video
- video segments
- weakly labeled
- real time
- video streams
- multimedia
- multimedia search
- database
- video frames
- video analysis
- natural scene images
- multimedia documents
- textual descriptions
- audio content
- video material
- information retrieval
- text mining
- user generated
- video retrieval
- text retrieval
- space time
- benchmark datasets
- metadata
- video surveillance
- street view
- closed captions
- multimedia data
- information extraction
- text documents
- image annotation
- key frames