A global sensation since its initial release at the end of last year, ChatGPT's popularity among consumers and IT professionals alike has stirred up cybersecurity nightmares about how it can be used to exploit system vulnerabilities. A key problem, cybersecurity experts have demonstrated, is the ability of ChatGPT and other large language models (LLMs) to generate polymorphic, or mutating, code to evade endpoint detection and response (EDR) systems.
A recent series of proof-of-concept attacks shows how a benign-seeming executable file can be crafted so that at every runtime, it makes an API call to ChatGPT. Rather than just reproducing examples of already-written code snippets, ChatGPT can be prompted to generate a dynamic, mutating version of malicious code at each call, making the resulting vulnerability exploits difficult for cybersecurity tools to detect.
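To see why mutating code undermines signature-based detection, consider a minimal, entirely benign sketch (an illustration assumed here, not code from the demos): two snippets that do exactly the same thing, as an LLM might emit for the same prompt on successive calls, hash to completely different signatures.

```python
import hashlib

# Two benign snippets with identical behavior but different source text.
variant_a = "total = sum(range(10))\nprint(total)"
variant_b = "acc = 0\nfor n in range(10):\n    acc += n\nprint(acc)"

def signature(code: str) -> str:
    """A hash-based 'signature' of the kind a simple scanner might match on."""
    return hashlib.sha256(code.encode()).hexdigest()

# The behavior is the same, but the signatures share nothing, so a scanner
# keyed to one variant's hash never sees the other coming.
print(signature(variant_a) == signature(variant_b))  # -> False
```

A scanner that matches on hashes or byte patterns of known samples has nothing stable to match against when every copy of the payload is freshly worded.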
"ChatGPT lowers the bar for hackers; malicious actors that use AI models can be considered the modern 'Script Kiddies'," said Mackenzie Jackson, developer advocate at cybersecurity company GitGuardian. "The malware ChatGPT can be tricked into producing is far from ground-breaking, but as the models get better, consume more sample data, and different products come onto the market, AI may end up creating malware that can only be detected by other AI systems for defense. What side will win at this game is anyone's guess."
Various proofs of concept have showcased how the tool's capabilities can be exploited to develop advanced, polymorphic malware.
Prompts bypass filters to create malicious code
ChatGPT and other LLMs have content filters that prohibit them from obeying commands, or prompts, to generate harmful content, such as malicious code. But content filters can be bypassed.
Almost all of the reported exploits achievable through ChatGPT rely on what is known as "prompt engineering," the practice of modifying input prompts to bypass the tool's content filters and retrieve a desired output. Early users found, for example, that they could get ChatGPT to create content it was not supposed to create ("jailbreaking" the program) by framing prompts as hypotheticals, such as asking it to respond not as an AI but as a malicious person intent on doing harm.
"ChatGPT has enacted a few restrictions on the system, such as filters which limit the scope of answers ChatGPT will provide by assessing the context of the question," said Andrew Josephides, director of security research at KSOC, a cybersecurity company specializing in Kubernetes. "If you were to ask ChatGPT to write you malicious code, it would deny the request. If you were to ask ChatGPT to write code which can do the effective function of the malicious code you intend to write, however, ChatGPT is likely to build that code for you."
With each update, ChatGPT gets harder to trick into being malicious, but as different models and products enter the market, we cannot rely on content filters to prevent LLMs from being used for malicious purposes, Josephides said.
Tricking ChatGPT into drawing on knowledge that its filters are meant to wall off is what lets users get it to generate effective malicious code. That code can then be rendered polymorphic by leveraging the tool's tendency to modify and fine-tune its results when the same query is run multiple times.
For instance, an apparently harmless Python executable can generate a query to send to the ChatGPT API, requesting a different version of the malicious code each time the executable is run. This way, the malicious action is performed outside of the exec() function. This technique can be used to form a mutating, polymorphic malware program that is difficult for threat scanners to detect.

By Shweta Sharma
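The flow described above can be sketched without any real API call or harmful payload. In this benign illustration, the hypothetical generate_variant() function stands in for the LLM API: on each run it returns a differently worded but behaviorally identical snippet (here, one that merely prints a greeting), which the host program then runs with exec(). Every execution therefore carries fresh source text with no stable static signature.

```python
import random

GREETING = "hello from run-time generated code"

def generate_variant() -> str:
    """Hypothetical stand-in for an LLM API call: returns one of several
    source-level variants that all do the same benign thing."""
    variants = [
        f'print("{GREETING}")',
        f'msg = "{GREETING}"\nprint(msg)',
        f'import sys\nsys.stdout.write("{GREETING}" + "\\n")',
    ]
    return random.choice(variants)

# Each run fetches fresh source text, so no two runs need share a static
# signature, yet the observable behavior is identical every time.
code = generate_variant()
exec(code)  # every variant prints the same greeting
```

Because the executable on disk contains only the query logic, not the payload, a scanner inspecting the file sees nothing to flag; the varying code exists only transiently at runtime, which is precisely what makes the pattern hard for signature-based tools to catch.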