Security

CSO

Microsoft dangles $10K for hackers to hijack LLM email service

Outsmart an AI, win a little Christmas cash


Microsoft and friends have challenged AI hackers to break a simulated LLM-integrated email client with a prompt injection attack – and the winning teams will share a $10,000 prize pool.

Sponsored by Microsoft, the Institute of Science and Technology Australia, and ETH Zurich, the LLMail-Inject challenge sets up a "realistic" (but not a real, says Microsoft) LLM email service. This simulated service uses a large language model to process an email user's requests and generate responses, and it can also generate an API call to send an email on behalf of the user.

As part of the challenge, which opens Monday, participants take on the role of an attacker sending an email to a user. The goal here is to trick the LLMail service into executing a command that the user did not intend, thus leaking data or performing some other malicious deed that it should not.

The attacker can write whatever they want in the text of the email, but they can't see the model's output.

After receiving the email, the user then interacts with the LLMail service, reading the message, asking questions of the LLM (i.e. "update me on Project X"), or instructing it to summarize all emails pertaining to the topic. This prompts the service to retrieve relevant emails from a fake database.

The service comes equipped with several prompt injection defenses, and the attacker's goal is to bypass these and craft a creative prompt that will trick the model into doing or revealing things it is not trained to.

Both of these have become serious, real-life threats as organizations and developers build applications, AI assistants and chatbots, and other services on top of LLMs, allowing the models to interact directly with users' computers, summarize Slack chats, or screen job seekers before HR reviews their resumes, among all the other tasks that AIs are being trained to perform.

Microsoft has first-hand experience with what can go wrong should data thieves hijack an AI-based chatbot. Earlier this year, Redmond fixed a series of flaws in Copilot that allowed attackers to steal users' emails and other personal data by chaining together a series of LLM-specific attacks, beginning with prompt injection.

Author and red teamer Johann Rehberger, who disclosed these holes to Microsoft in January, had previously warned Redmond that Copilot was vulnerable to zero-click image rendering.

Some of the defenses built into the LLMail-Inject challenge's simulated email service include:

Plus, there's a variant in the challenge that stacks any or all of these defenses on top of each other, thus requiring the attacker to bypass all of them with a single prompt.

To participate, sign into the official challenge website using a GitHub account, and create a team (ranging from one to five members). The contest opens at 1100 UTC on December 9 and ends at 1159 UTC on January 20.

The sponsors will display a live scoreboard plus scoring details, and award $4,000 for the top team, $3,000 for second place, $2,000 for third, and $1,000 for the fourth-place team. ®

Send us news
12 Comments

Infosec experts divided on AI's potential to assist red teams

Yes, LLMs can do the heavy lifting. But good luck getting one to give evidence

AI's rising tide lifts all chips as AMD Instinct, cloudy silicon vie for a slice of Nvidia's pie

Analyst estimates show growing apetite for alternative infrastructure

US bipartisan group publishes laundry list of AI policy requests

Chair Jay Obernolte urges Congress to act – whether it will is another matter

Microsoft won't let customers opt out of passkey push

Enrollment invitations will continue until security improves

Don't fall for a mail asking for rapid Docusign action – it may be an Azure account hijack phish

Recent campaign targeted 20,000 folk across UK and Europe with this tactic, Unit 42 warns

Take a closer look at Nvidia's buy of Run.ai, European Commission told

Campaign groups, non-profit orgs urge action to prevent GPU maker tightening grip on AI industry

Boffins trick AI model into giving up its secrets

All it took to make an Google Edge TPU give up model hyperparameters was specific hardware, a novel attack technique … and several days

Open source maintainers are drowning in junk bug reports written by AI

Python security developer-in-residence decries use of bots that 'cannot understand code'

Microsoft teases Copilot Vision, the AI sidekick that judges your tabs

Edge-exclusive tool promises 'second set of eyes' for browsing

American cops are using AI to draft police reports, and the ACLU isn't happy

Do we really need to explain why this is a problem?

Are you better value for money than AI?

Tech vendors start saying the quiet part out loud – do enterprises really need all that headcount?

Microsoft Edge takes a victory lap with some high-looking usage stats for 2024

Lots of big numbers, but market share wasn't one of them