Tech

How a Seemingly Harmless Image Can Jailbreak Vision-Language AI Models

Published

3 hours ago

28 June 2026

NewsAdmin

Slashdot reader BrianFagioli writes: Florida International University researchers have developed a technique called JaiLIP (Jailbreaking with Loss-guided Image Perturbation) that uses subtle image modifications to bypass AI safety guardrails. Unlike traditional jailbreaks that rely on carefully crafted prompts, the attack works through images that appear normal to human viewers.

The researchers tested the technique against BLIP-2, a multimodal AI model, and found that manipulated images significantly increased the likelihood of harmful responses. According to the study, the approach outperformed previous image-based jailbreak methods and nearly doubled the number of unsafe outputs generated during testing.

The findings highlight a potential security risk for businesses deploying AI systems that process both images and text. While most discussions about AI safety focus on prompts, the research suggests that seemingly harmless images may also serve as an attack vector.

Source link

WordUp News

Tech

How a Seemingly Harmless Image Can Jailbreak Vision-Language AI Models

Leave a Reply
Cancel reply

Leave a Reply

Trending

Leave a Reply Cancel reply

Leave a Reply

Trending

Leave a Reply
Cancel reply