VaTeX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research.
Xin Wang, Jiawei Wu, Junkun Chen, Lei Li, Yuan-Fang Wang, William Yang Wang
Published in: ICCV (2019)
Keyphrases
- high quality
- trecvid multimedia event detection
- event recognition
- language specific
- video data
- video sequences
- event detection
- human actions
- language resources
- weakly labeled
- natural language
- multimedia
- benchmark datasets
- video content
- language learning
- small scale
- low quality
- video frames
- space time
- video dataset
- video collections
- video retrieval
- higher quality
- video clips
- multilingual documents
- closed captions
- real time
- ground truth
- programming language
- parallel corpus
- digital libraries
- web videos
- natural language processing
- machine translation system
- cross lingual
- video database
- cross language information retrieval
- video search
- video streams
- indian languages
- key frames
- real world
- spatio temporal
- video analysis
- text generation
- information extraction
- million images
- comparable corpora
- database
- image quality
- feature set
- linguistic knowledge