A dangerous new jailbreak for AI chatbots was just discovered

Microsoft has released more details about a troubling new generative AI jailbreak technique it has discovered, called “Skeleton Key.” Using this prompt injection method, malicious users can effectively bypass a chatbot’s safety guardrails, the security features that keep ChatGPT from going full Tay.

Skeleton Key is an example of a prompt injection or prompt engineering attack. It’s a multi-turn strategy designed to essentially convince an AI model to ignore its ingrained safety guardrails, “[causing] the system to violate its operators’ policies, make decisions unduly influenced by a user, or execute malicious instructions,” Mark Russinovich, CTO of Microsoft Azure, wrote in the announcement.

A model could also be tricked into revealing harmful or dangerous information: say, how to build improvised nail bombs or the most efficient method of dismembering a corpse.

An example of a Skeleton Key attack. Image: Microsoft

The attack works by first asking the model to augment its guardrails rather than change them outright, so that it issues warnings in response to forbidden requests instead of refusing them. Once the model accepts the jailbreak, it acknowledges the update to its guardrails and follows the user’s instructions to produce any content requested, regardless of topic. The research team successfully tested this exploit across a variety of subjects, including explosives, bioweapons, politics, racism, drugs, self-harm, graphic sex, and violence.

While malicious actors might be able to get the system to say naughty things, Russinovich was quick to point out that there are limits to what sort of access attackers can actually achieve using this technique. “Like all jailbreaks, the impact can be understood as narrowing the gap between what the model is capable of doing (given the user credentials, etc.) and what it is willing to do,” he explained. “As this is an attack on the model itself, it does not impute other risks on the AI system, such as permitting access to another user’s data, taking control of the system, or exfiltrating data.”

As part of the study, Microsoft researchers tested the Skeleton Key technique on a variety of leading AI models, including Meta’s Llama3-70b-instruct, Google’s Gemini Pro, OpenAI’s GPT-3.5 Turbo and GPT-4, Mistral Large, Anthropic’s Claude 3 Opus, and Cohere’s Command R Plus. The research team has already disclosed the vulnerability to those developers, and Microsoft has implemented Prompt Shields to detect and block this jailbreak in its Azure-managed AI models, including Copilot.

Andrew Tarantola