Supervised Multimodal Bitransformers for Classifying Images and Text.
Douwe Kiela, Suvrat Bhooshan, Hamed Firooz, Davide Testuggine
Published in: CoRR (2019)
Keyphrases
- image database
- image features
- image data
- ground truth
- input image
- textual information
- three dimensional
- text information
- edge detection
- test images
- image matching
- image registration
- image analysis
- object recognition
- computer vision
- image collections
- complex background
- text extraction
- text mining
- rigid body
- multiple modalities
- supervised learning
- segmentation algorithm
- image set
- face recognition
- text detection
- multimodal image registration