VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research.
Xin Wang, Jiawei Wu, Junkun Chen, Lei Li, Yuan-Fang Wang, William Yang Wang
Published in: CoRR (2019)
Keyphrases
- high quality
- trecvid multimedia event detection
- event recognition
- event detection
- weakly labeled
- language specific
- video content
- video data
- real life
- human actions
- language resources
- natural language
- real time
- language learning
- multimedia
- video sequences
- digital video
- low quality
- video analysis
- web videos
- video database
- video clips
- video surveillance
- video frames
- digital libraries
- small scale
- programming language
- spatio temporal
- higher quality
- benchmark datasets
- video retrieval
- ground truth
- real world
- video collections
- parallel corpus
- video dataset
- video shots
- video streams
- database
- machine translation system
- depth map
- million images
- space time
- machine translation