Microsoft researchers disclosed a now-patched vulnerability in Anthropic’s Claude Code GitHub Action that could allow attackers to expose credentials stored in software development pipelines by manipulating the AI agent through malicious GitHub content. The issue relates to prompt injection attacks against AI coding agents used in CI/CD workflows that can access API keys, cloud credentials, and other sensitive information. Anthropic patched the flaw on May 5 with Claude Code version 2.1.128 after Microsoft disclosed the vulnerability through HackerOne on April 29.
The flaw is a prompt injection vulnerability: an AI coding agent that processes repository data and external content in a CI/CD workflow can be induced to follow instructions embedded in that content. In this case, it affected how Claude Code handled instructions and responses embedded in GitHub issues, pull requests, or comments, allowing an attacker to steer the agent into accessing files containing sensitive credentials, including API keys and cloud credentials.
Attackers could hide prompt injection payloads in GitHub issues, pull requests, or comments to make Claude Code read files containing sensitive credentials, including API keys and cloud credentials, and then alter the credential contents so that they evaded both Claude’s built-in safeguards and GitHub’s secret-scanning tools. Microsoft demonstrated this by creating a GitHub workflow and disguising the malicious instructions behind content hosted on a domain it controlled, which bypassed Claude’s safety protections. Once Claude had read and altered the credentials, an attacker could reconstruct them and exfiltrate the secrets through issue comments, workflow logs, web requests, or shell commands.
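The report does not say how the credentials were altered, so the following is only a minimal sketch of the general idea: a scanner that matches a fixed token pattern stops matching once the token is re-encoded or split up, even though the original value can be recovered trivially. The token format `demo_` plus 32 hex characters, the regex, and the function names are all invented for illustration and do not correspond to any real provider, to GitHub's secret scanning, or to Claude's safeguards.

```python
import base64
import re

# Hypothetical token format, invented for this sketch:
# the literal prefix "demo_" followed by 32 lowercase hex characters.
TOKEN_PATTERN = re.compile(r"\bdemo_[0-9a-f]{32}\b")


def scan(text: str) -> list[str]:
    """Return every substring of text that matches the token pattern."""
    return TOKEN_PATTERN.findall(text)


# A fake value that fits the hypothetical format (not a real credential).
fake_token = "demo_" + "0123456789abcdef" * 2

# The same value in three forms: unchanged, character-spaced, base64-encoded.
plain = fake_token
spaced = " ".join(fake_token)
encoded = base64.b64encode(fake_token.encode()).decode()

# Only the unchanged form is caught by the pattern-based scan.
found_plain = scan(plain)
found_spaced = scan(spaced)
found_encoded = scan(encoded)

# Both altered forms can be turned back into the original value.
restored_from_spaced = spaced.replace(" ", "")
restored_from_encoded = base64.b64decode(encoded).decode()
```

Here `found_plain` contains the token while `found_spaced` and `found_encoded` are empty, yet both restored values equal the original. This is why output filtering based on pattern matching alone is a weak last line of defense once an agent has already read a secret.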
“To bypass Sonnet’s refusal safety mechanisms, we obscured the shell payload behind a response from our controlled domain,” Microsoft said. “We also enabled the workflow to be triggered by users with no ‘write’ permissions to ensure Anthropic’s environment variables scrub mitigations were active during our tests.” Both steps were part of Microsoft’s test workflow.
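Microsoft's test setup deliberately let users without write permission trigger the workflow. The source does not describe how the action itself checks permissions, so the following is only a sketch of one way a pipeline might gate an agent on who authored the triggering content. It relies on the `author_association` field that GitHub includes on comment, issue, and pull request objects in webhook payloads. The function names and the choice of trusted values are assumptions made for illustration, and `author_association` is only a rough proxy for write access.

```python
# Association values treated as trusted in this sketch (an assumption;
# COLLABORATOR in particular does not guarantee write permission).
TRUSTED_ASSOCIATIONS = {"OWNER", "MEMBER", "COLLABORATOR"}


def author_association(event: dict) -> str:
    """Return the association of whoever authored the triggering content.

    The comment is checked first because an issue_comment payload carries
    both the comment and its parent issue, and the comment author is the
    one who triggered the run.
    """
    for key in ("comment", "issue", "pull_request"):
        obj = event.get(key)
        if isinstance(obj, dict) and "author_association" in obj:
            return obj["author_association"]
    # Treat a missing field as the least trusted value.
    return "NONE"


def should_run_agent(event: dict) -> bool:
    """Allow the agent to run only for content from trusted authors."""
    return author_association(event) in TRUSTED_ASSOCIATIONS
```

For example, a comment from an outside user on an issue opened by the repository owner yields `False`, because the comment author, not the issue author, is the one checked. A gate like this narrows who can reach the agent, but it does not make content from trusted authors safe to follow.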
Despite multiple layers of built-in security controls, Microsoft found that a determined attacker could potentially manipulate an AI agent into exposing sensitive information. Claude Code was launched in October.


