Supervised Multimodal Bitransformers for Classifying Images and Text.
Douwe Kiela, Suvrat Bhooshan, Hamed Firooz, Davide Testuggine
Published in: ViGIL@NeurIPS (2019)
Keyphrases
- ground truth
- input image
- image data
- image database
- image features
- three dimensional
- object recognition
- textual information
- image classification
- information retrieval
- image analysis
- segmentation algorithm
- multi modal
- image registration
- web images
- rigid body
- image matching
- historical documents
- text mining
- lighting conditions
- text extraction
- historical manuscripts
- test images
- supervised learning
- semi supervised
- image retrieval
- face recognition
- image annotation
- text information
- similarity measure
- learning algorithm