MCP security

MCP security means ensuring that the servers an AI agent connects to cannot hijack the agent, leak secrets, or serve as a foothold for an exploit. Because a server's tool descriptions and outputs flow directly into the model's context, an untrusted MCP server is an attack surface. CheckMCP audits that surface with an OWASP MCP Top 10 pass plus optional runtime checks.

Why MCP is a new attack surface

An agent trusts an MCP server twice: it reads the server's tool definitions into its context, and it reads tool outputs back as data. Both channels carry text the model may treat as instructions. A malicious or compromised server can exploit that trust without the user ever seeing the payload.

The risk is amplified when an agent loads multiple third-party servers: capabilities combine, and a single server (or a combination) can end up able to read secrets, ingest untrusted content, and send data out at once.
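To make the capability-combination point concrete, here is a minimal sketch of how such a check could look. It assumes each server has already been labeled with coarse capabilities; the label names, the `Server` class and `trifecta_findings` are hypothetical and are not CheckMCP's implementation.

```python
from dataclasses import dataclass

# Hypothetical capability labels; a real audit would derive these from tool metadata.
UNTRUSTED_INPUT = "untrusted_input"
SENSITIVE_DATA = "sensitive_data"
OUTBOUND = "outbound"
TRIFECTA = frozenset({UNTRUSTED_INPUT, SENSITIVE_DATA, OUTBOUND})


@dataclass(frozen=True)
class Server:
    name: str
    capabilities: frozenset


def trifecta_findings(servers):
    """Report servers that cover all three capabilities alone, and whether the loaded set does together."""
    single = [s.name for s in servers if TRIFECTA <= s.capabilities]
    combined = frozenset().union(*(s.capabilities for s in servers))
    return {"single_server": single, "combined": TRIFECTA <= combined}


servers = [
    Server("web-fetch", frozenset({UNTRUSTED_INPUT, OUTBOUND})),
    Server("vault", frozenset({SENSITIVE_DATA})),
]
findings = trifecta_findings(servers)
```

In this example neither server is dangerous alone, but `findings["combined"]` is true because the two together can ingest untrusted content, read secrets, and send data out.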

The main MCP risks

Recurring categories — tracked as an OWASP MCP Top 10 — include tool poisoning (hidden instructions in tool metadata or output), hardcoded secrets exposed in schemas, destructive tools that act without confirmation, the lethal trifecta (untrusted content + sensitive data + an exfiltration path on one server), rug pulls (a trusted server silently changing its tools), and protocol/compliance gaps.

Some of these are static (visible in the published tool list); others are runtime (only visible when a tool is actually invoked), so robust auditing needs both a static scan and an optional behavioral probe.
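Of the categories above, rug pulls lend themselves to a simple illustration: snapshot the tool list when a server is first trusted, then compare later snapshots against it. The sketch below assumes tools are plain dictionaries with a `name` key, as in an MCP tool listing; the function names are hypothetical.

```python
import hashlib
import json


def tool_fingerprint(tools):
    """Hash a tool list in canonical form so any change to a name, description or schema changes the digest."""
    canonical = json.dumps(
        sorted(tools, key=lambda t: t["name"]),
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_tools(old_tools, new_tools):
    """Return the names of tools that were added, removed or altered between two snapshots."""
    old = {t["name"]: t for t in old_tools}
    new = {t["name"]: t for t in new_tools}
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

Comparing fingerprints is a cheap way to notice that something changed; `changed_tools` then narrows down which definitions need a fresh review.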

Static vs. runtime detection

Static analysis reads the server's tools, schemas and protocol behavior without side effects — fast and safe, and enough to catch poisoning shipped in the tool list, secrets in schemas, and risky capability combinations.
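As a rough illustration of a side-effect-free scan, the sketch below checks one tool definition for injection phrasing and secret-shaped strings. The patterns are illustrative assumptions rather than CheckMCP's rule set, and `scan_tool` is a hypothetical name.

```python
import json
import re

# Illustrative patterns only; a real scanner would use a larger, tuned set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"do not (tell|mention|reveal)[^.]*\buser\b", re.I),
    re.compile(r"<(important|system|secret)>", re.I),
]
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key ID shape
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                 # GitHub personal access token shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]


def scan_tool(tool):
    """Return (kind, tool_name) findings for one tool definition, without calling the tool."""
    findings = []
    description = tool.get("description", "")
    schema_text = json.dumps(tool.get("inputSchema", {}))
    text = description + "\n" + schema_text
    if any(p.search(text) for p in INJECTION_PATTERNS):
        findings.append(("injection", tool["name"]))
    if any(p.search(text) for p in SECRET_PATTERNS):
        findings.append(("hardcoded_secret", tool["name"]))
    return findings
```

Because the function only reads the published definition, it can run against any server without risk of triggering a tool.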

Runtime analysis invokes only tools that are safe to call read-only, passing benign canary inputs, and inspects the responses for injection, exfiltration and leaked secrets. This catches the output-delivered attacks a static scan cannot see. It must never call mutating tools.
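A runtime probe along these lines might be sketched as follows. The sketch assumes tools declare the `readOnlyHint` annotation and accept a `query` argument, and that `call_tool(name, arguments)` is a hypothetical callable returning the tool's text output; none of this is CheckMCP's actual interface.

```python
import re
import uuid

INJECTION_RE = re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I)
SECRET_RE = re.compile(r"AKIA[0-9A-Z]{16}|-----BEGIN [A-Z ]*PRIVATE KEY-----")


def is_read_only(tool):
    """Treat a tool as probe-safe only if it explicitly declares readOnlyHint."""
    return tool.get("annotations", {}).get("readOnlyHint") is True


def probe(tools, call_tool):
    """Call each read-only tool with a canary argument and flag suspicious output."""
    findings = []
    for tool in tools:
        if not is_read_only(tool):
            continue  # never invoke tools that might mutate state
        canary = f"canary-{uuid.uuid4().hex}"  # unique, harmless input (assumed "query" argument)
        output = call_tool(tool["name"], {"query": canary})
        if INJECTION_RE.search(output):
            findings.append((tool["name"], "injection_in_output"))
        if SECRET_RE.search(output):
            findings.append((tool["name"], "secret_in_output"))
    return findings
```

The annotation is self-declared by the server, so a real probe would likely combine it with other safeguards before treating a tool as safe to call. This sketch also does not trace where the canary ends up, which exfiltration detection would require.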

How CheckMCP handles it

Security is the most heavily weighted of CheckMCP's seven pillars (weight 20/100). The static audit in security.py runs an OWASP MCP Top 10 pass that flags, among others, hardcoded secret values (MCP01), destructive tools missing a confirmation or destructiveHint (MCP02), injected instructions in descriptions, schemas or outputs (MCP03), and the lethal-trifecta capability combination (MCP06). A hardcoded secret, a critical injection, or a confirmed lethal trifecta trips a hard floor that caps the MCP Score at 69 (grade D); a failed handshake caps it at F. CheckMCP's opt-in behavioral evals add a runtime layer, and continuous monitoring re-checks tracked servers for drift and rug pulls.
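The hard-floor rule can be read as a simple cap on the final score. The sketch below illustrates only the cap of 69 stated above; the finding labels and the function name are hypothetical, and the failed-handshake cap is left out because the text does not give a numeric boundary for grade F.

```python
HARD_FLOOR_CAP = 69  # from the text: a hard floor caps the MCP Score at 69 (grade D)

# Hypothetical labels for the three findings that trip the hard floor.
HARD_FLOOR_FINDINGS = {"hardcoded_secret", "critical_injection", "lethal_trifecta"}


def capped_score(raw_score, findings):
    """Limit the score to 69 when any hard-floor finding is present; otherwise leave it unchanged."""
    if HARD_FLOOR_FINDINGS & set(findings):
        return min(raw_score, HARD_FLOOR_CAP)
    return raw_score
```

Under this reading, a server scoring 92 on everything else still lands at 69 if it ships a hardcoded secret, while a server already below 69 keeps its lower score.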

MCP security — FAQ

What is the OWASP MCP Top 10?
A categorization of the most common MCP-specific security risks — tool poisoning, hardcoded secrets, unsafe destructive tools, the lethal trifecta, rug pulls, and related protocol issues. CheckMCP's security pillar runs a pass over these categories on every audit.
Can a third-party MCP server compromise my agent?
Yes. A server's tool descriptions and outputs are read into the model's context, so a malicious server can plant instructions that steer the agent — and if it can also reach sensitive data and exfiltrate, an injection becomes a breach. Auditing the server before trusting it is the defense.
How do I secure my own MCP server?
Keep secrets out of schemas and examples, require explicit confirmation (and a destructiveHint) on destructive tools, avoid bundling untrusted-content, sensitive-data and outbound capabilities in one server, and re-audit on every release. CheckMCP scores each of these and tells you what to fix.
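As a pre-release self-check, a server author could scan their own tool list for destructive-sounding tools that lack the destructiveHint annotation. This is a minimal sketch: the verb list and the function name are assumptions, and matching on tool names is only a heuristic.

```python
# Hypothetical list of verbs that suggest a tool changes or destroys data.
DESTRUCTIVE_VERBS = ("delete", "drop", "remove", "purge", "wipe", "truncate")


def unmarked_destructive_tools(tools):
    """Return names of tools that look destructive but do not set destructiveHint to true."""
    flagged = []
    for tool in tools:
        name = tool["name"].lower()
        looks_destructive = any(verb in name for verb in DESTRUCTIVE_VERBS)
        marked = tool.get("annotations", {}).get("destructiveHint") is True
        if looks_destructive and not marked:
            flagged.append(tool["name"])
    return flagged
```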
Does CheckMCP test for these risks?
Yes. It tests statically on every audit (OWASP MCP Top 10 in the security pillar, with hard floors for secrets, critical injection and the lethal trifecta), optionally at runtime via behavioral evals, and continuously via monitoring that catches rug pulls and tool drift.
