(Ab)using Images and Sounds for Indirect Instruction Injection in Multi-Modal LLMs.
Eugene Bagdasaryan, Tsung-Yin Hsieh, Ben Nassi, Vitaly Shmatikov
Published in: CoRR (2023)
Keyphrases
- multi modal
- image annotation
- fusing multiple
- image database
- image analysis
- multiple modalities
- image retrieval
- image classification
- input image
- object recognition
- image features
- image registration
- image collections
- multi modality
- segmentation method
- image data
- semantic concepts
- cross modal
- similarity measure
- high dimensional
- single modality
- computer vision
- auto annotation
- web images
- audio visual
- fully automatic
- visual features
- edge detection
- video search
- image regions
- segmentation algorithm