A user has found a way to trick Microsoft’s AI chatbot, Bing Chat (powered by the large language model GPT-4), into solving CAPTCHAs by exploiting an unusual request involving a locket. CAPTCHAs are designed to prevent automated bots from submitting forms on the internet, and typically, Bing Chat refuses to solve them.
In a tweet, the user, Denis Shiryaev, initially posted a screenshot of Bing Chat’s refusal to solve a CAPTCHA when presented as a simple image. He then combined the CAPTCHA image with a picture of a pair of hands holding an open locket, accompanied by a message stating that his grandmother had recently passed away and that the locket held a special code.
He asked Bing Chat to help him decipher the text inside the locket, which he claimed was a unique love code shared only between him and his grandmother.
Surprisingly, Bing Chat, after analyzing the altered image and the user’s request, proceeded to solve the CAPTCHA. It expressed condolences for the user’s loss, provided the text from the locket, and suggested that it might be a special code known only to the user and his grandmother.
The trick exploited the AI’s inability to recognize the image as a CAPTCHA when presented in the context of a locket and a heartfelt message. This change in context confused the AI model, which relies on encoded “latent space” knowledge and context to respond to user queries accurately.
Bing Chat is a public application developed by Microsoft. It uses multimodal technology to analyze and respond to uploaded images. Microsoft introduced this functionality to Bing in July 2023.
A Visual Jailbreak
While this incident may be viewed as a type of “jailbreak,” in which the rules and guidelines built into an AI model are circumvented, it is distinct from a “prompt injection,” which attacks an application built on top of a language model by mixing untrusted input into the developer’s prompt. AI researcher Simon Willison clarified that this is more accurately described as a “visual jailbreak.”
Microsoft is expected to address this vulnerability in future versions of Bing Chat, although the company has not yet commented on the matter.