Stop Reasoning! When Multimodal LLMs with Chain-of-Thought Reasoning Meets Adversarial Images.
Zefeng Wang, Zhen Han, Shuo Chen, Fan Xue, Zifeng Ding, Xun Xiao, Volker Tresp, Philip H. S. Torr, Jindong Gu
Published in: CoRR (2024)
Keyphrases
- image data
- image retrieval
- image analysis
- edge detection
- input image
- reasoning process
- image features
- object recognition
- image registration
- image classification
- multiple images
- spatial reasoning
- knowledge representation
- ground truth
- test images
- image collections
- three dimensional
- computer vision
- multi agent
- multi modal
- keypoints
- spatial relationships