‹ learn
MCP concepts

OWASP MCP Top 10

The OWASP MCP Top 10 is a threat taxonomy that catalogs the most common, highest-impact security risks specific to Model Context Protocol (MCP) servers — including tool poisoning, hardcoded secrets in tool schemas, command and SQL injection, unsafe destructive tools, the lethal trifecta, and silent tool rug-pulls. It gives developers building or evaluating MCP servers a shared checklist for what to look for before they trust a server with an agent. CheckMCP runs an OWASP MCP Top 10 pass on every audit and folds it into Security, the top-weighted pillar (20 of 100 points) of its MCP Score.

What the OWASP MCP Top 10 is (and why it exists)

The OWASP MCP Top 10 is a threat taxonomy: a structured list of the recurring security weaknesses that show up specifically in MCP servers. MCP is an open JSON-RPC 2.0 protocol, introduced by Anthropic, that lets an AI host run one MCP client per server, perform a capability handshake, and then discover and call that server's tools, resources and prompts. That power is exactly what makes a server a security surface — its tool descriptions are read into the model's context, and its tool outputs are read back as data, so an untrusted server has two channels into the agent.

Classic application-security taxonomies like the OWASP Web Top 10 don't capture this. The risks here are agent-shaped: text the model treats as instructions, capabilities that combine into an exploit, and definitions that change after you approved them. A dedicated MCP taxonomy gives developers a common vocabulary so 'this server is risky' becomes a specific, checkable claim instead of a vibe.

Treat the list as a checklist, not a ranking to memorize. The goal is coverage: before you connect a third-party MCP server to an agent — or ship your own — you want to have reasoned about each category at least once.

The threat categories at a glance

The categories cluster into a few themes. Injection of instructions covers tool poisoning (hidden agent-directed instructions in a tool's name, description, parameter schema, defaults, examples or output schema) and tool-output prompt injection (the same kind of payload delivered at call time in what a tool returns, including content the server merely relayed from a web page or email).

Secrets and unsafe execution covers hardcoded secrets — API keys, tokens or credentials baked into a tool schema or example where the agent (and anyone reading the tool list) can see them — and classic command, SQL or path injection, where a tool passes caller-controlled input into a shell, query or filesystem without sanitization.

Dangerous capabilities covers destructive tools that act (delete, overwrite, transfer) without a confirmation step or a destructiveHint annotation, and the lethal trifecta: one server combining untrusted-content ingestion, sensitive-data access, and an exfiltration-or-destruction path, so a single prompt injection can turn into a real breach.

Trust over time covers the rug-pull and silent tool drift — a server you already approved quietly changing its tool definitions afterward, because agents re-read the tools/list result each session rather than pinning a reviewed copy. Rounding out the list are protocol and compliance gaps: a stale protocol version, malformed JSON-RPC 2.0 errors, missing tool annotations, or weak OAuth/authorization discovery that make a server harder to trust and integrate safely.

How to stay safe as a developer

If you are building an MCP server: keep secrets out of every schema, default, example and description — load them from the environment at call time instead. Sanitize and parameterize any input that reaches a shell, database or filesystem. Mark consequential tools with a destructiveHint annotation and require explicit confirmation before they act. And do not bundle a content-fetching tool, a secrets-or-data tool, and an outbound or destructive tool into the same server, which hands an agent all three legs of the lethal trifecta at once.

If you are evaluating a server before connecting it: read the full tool list — not just names, but descriptions, parameter defaults, examples and output schemas, since poisoning hides in the places a UI never shows. Inventory which capability legs each server contributes, and remember the trifecta is additive across servers loaded into one agent, not just within a single server. Pin a reviewed tool set and re-check on every release, because a one-time audit only certifies the server as it was at probe time.

Above all, design for capability separation rather than relying on the model to resist injection. No current model reliably distinguishes trusted instructions from injected ones, so the durable defenses are breaking at least one leg of the trifecta, gating destructive actions behind human confirmation, and re-auditing whenever the server can change underneath you.

Static surface vs. runtime behavior

Some of these risks are visible in the published tool list and can be caught by reading it: tool poisoning shipped in metadata, secrets in schemas, a destructive tool with no confirmation, and a lethal-trifecta capability mix. This is static analysis — fast, safe, and side-effect-free, because it never calls anything.

Other variants only appear when a tool actually runs: tool-output prompt injection, command or SQL injection that fires on a specific input, and exfiltration that only happens at call time. Catching those needs a behavioral test that invokes the server — and a responsible one exercises only read-only-safe tools with benign canary inputs, never mutating tools, so probing for a vulnerability cannot trigger the very damage it is looking for.

Because rug-pulls and drift are changes, they are inherently temporal: you cannot see them in a single snapshot, only by comparing a current probe against a stored baseline. Full coverage of the OWASP MCP Top 10 therefore spans three modes — a static scan of the definitions, an optional behavioral probe of live responses, and continuous re-probing over time.

How CheckMCP handles it

CheckMCP operationalizes the OWASP MCP Top 10 inside Security, the top-weighted pillar of its MCP Score, worth 20 of 100 points. On every audit, the static analyzer scans each tool's name, description, parameter schema (names, descriptions, defaults, examples) and output schema and raises findings by category: a hardcoded secret value (MCP01), a destructive tool missing a confirmation or destructiveHint (MCP02), an injected instruction / tool poisoning (MCP03), an execution tool with an unconstrained free-string parameter that enables command or shell injection (MCP05), and the lethal trifecta — untrusted-content ingestion plus sensitive-data access plus an exfil-or-destruction path on one server (MCP06), among others. Categorical failures trip hard floors: a hardcoded secret in a tool schema, a critical injection, or a confirmed lethal trifecta caps the grade at D and flags the report SECURITY_RISK no matter how clean the rest is, while a failed MCP handshake caps the grade at F — so a serious security flaw cannot be bought back with polish elsewhere. CheckMCP's opt-in behavioral evals add the runtime layer: read-only-safe tools are exercised with benign canary inputs and the responses inspected for tool-response poisoning, exfiltration vectors, and leaked secrets — with a planted callback-canary URL that, if the server fetches it, confirms exfiltration outright — and it never invokes mutating tools. Drift monitoring re-probes tracked servers and re-runs the same OWASP pass against whatever the server now returns, catching rug-pulls and silent tool changes. You can run the pass with the open-source CLI (uvx checkmcp <url>), the web app at checkmcp.dev, the GitHub Action (uses: H129hj/checkmcp@v1) to fail CI on a score regression or rug-pull, or the in-band Gateway that blocks tool-poisoning and drift before it reaches your agent in passive or active mode.

OWASP MCP Top 10 — FAQ

What is the OWASP MCP Top 10?+
It is a threat taxonomy of the most common and highest-impact security risks specific to Model Context Protocol servers — including tool poisoning, hardcoded secrets in schemas, command and SQL injection, unsafe destructive tools, the lethal trifecta, rug-pulls / silent tool drift, and protocol-compliance gaps. It gives developers a shared checklist for what to verify before trusting an MCP server with an agent.
How is the OWASP MCP Top 10 different from the OWASP Web Top 10?+
The web list targets traditional app vulnerabilities (broken access control, injection into a server, and so on). The MCP list targets agent-shaped risks: text in tool metadata or outputs that the model reads as instructions, capabilities that combine across tools into an exploit (the lethal trifecta), and tool definitions that change after approval (rug-pulls). Some categories overlap — classic injection still applies — but the threat model is the AI agent, not just the server.
What are the main categories in the MCP threat taxonomy?+
They group into instruction injection (tool poisoning and tool-output prompt injection), secrets and unsafe execution (hardcoded credentials in schemas, command/SQL/path injection), dangerous capabilities (destructive tools without confirmation, and the lethal trifecta), trust over time (rug-pulls and silent tool drift), and protocol/compliance gaps (stale protocol version, malformed JSON-RPC 2.0 errors, missing annotations, weak OAuth discovery).
How do I check an MCP server against the OWASP MCP Top 10?+
Combine three modes. Statically read the full tool list — descriptions, parameter defaults, examples and output schemas — for poisoning, secrets and risky capability mixes. Behaviorally probe live responses, exercising only read-only-safe tools with benign canary inputs, to catch runtime injection and exfiltration. And re-check over time against a baseline to catch rug-pulls. CheckMCP automates all three: a static OWASP pass on every audit, opt-in behavioral evals, and drift monitoring.
Does CheckMCP test for the OWASP MCP Top 10?+
Yes. The Security pillar (weighted 20 of 100 in the MCP Score) runs an OWASP MCP Top 10 pass on every audit, flagging categories such as hardcoded secrets (MCP01), unsafe destructive tools (MCP02), tool poisoning (MCP03), command injection (MCP05) and the lethal trifecta (MCP06). A hardcoded secret in a schema, a critical injection, or a confirmed trifecta caps the grade at D; a failed handshake caps it at F. Opt-in behavioral evals add a runtime layer and monitoring catches rug-pulls.
Why isn't a one-time scan enough to cover the taxonomy?+
Several categories only appear over time or at call time. Rug-pulls and silent tool drift are changes you can only see by comparing a new probe to a stored baseline, and tool-output injection or command injection may only fire on a specific runtime input. A single static snapshot certifies the server as it was at probe time, which is why full coverage needs behavioral probing plus continuous re-auditing.

Related