MINT: A wrapper to make multi-modal and multi-image AI models interactive.
Jan Freyberg, Abhijit Guha Roy, Terry Spitz, Beverly Freeman, Mike Schaekermann, Patricia Strachan, Eva Schnider, Renee Wong, Dale R. Webster, Alan Karthikesalingam, Yun Liu, Krishnamurthy Dvijotham, Umesh Telang
Published in: CoRR (2024)
Keyphrases
- multi modal
- image data
- uni modal
- image analysis
- input image
- multi modality
- image features
- image segmentation
- auto annotation
- image classification
- image retrieval
- image regions
- single modality
- audio visual
- edge detection
- image representation
- image collections
- cross modal
- fusing multiple
- multiscale
- semantic concepts
- feature selection
- similarity measure
- web images
- parametric models
- segmentation method
- contrast enhancement
- video search
- image annotation
- image content
- high resolution
- low level