WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning.
Krishna Srinivasan, Karthik Raman, Jiecao Chen, Michael Bendersky, Marc Najork
Published in: SIGIR (2021)
Keyphrases
- machine learning
- input image
- image data
- multiscale
- image content
- single image
- image dataset
- image set
- image features
- image classification
- text mining
- image analysis
- high resolution
- segmentation method
- image retrieval
- million images
- image collections
- image segmentation
- image representation
- edge detection
- low level
- image regions
- multilingual
- text generation
- information extraction
- supervised machine learning
- world knowledge
- short texts
- natural language text
- web images
- test images
- text documents
- feature selection
- computer vision
- learning algorithm
- information retrieval