The Hidden Threat in MCP Servers: Why MCP Needs Its Own DLP Layer
Model Context Protocol is the new wire format between agents and tools. Browser DLP cannot see it. Here is what an MCP-aware DLP layer actually looks like.
Eighteen months from a spec to a shadow IT crisis
Anthropic published the first Model Context Protocol spec in November 2024. Eighteen months later MCP is the default plumbing between agents and tools in Claude Desktop, Cursor, Claude Code, Continue, Zed, Windsurf, the Copilot agent panel, and a growing list of enterprise IDE plugins. The community marketplace at mcp.so lists more than two thousand servers covering Notion, Linear, GitHub, Postgres, Snowflake, Stripe, Jira, S3, Kubernetes, internal wikis and bespoke vendor APIs. Every one of those servers can read data on behalf of a model.
Almost none of them sit behind a DLP gate.
Browser extensions catch the ChatGPT tab. Desktop agents catch the Claude desktop app. Neither one sees what flows over a stdio pipe between Claude Desktop and the Postgres MCP server you installed last week. That blind spot is what this post is about.
The 30-second MCP primer
MCP is a JSON-RPC 2.0 protocol with two transports: stdio (the model host spawns the server as a subprocess and talks over standard input and output) and SSE (server-sent events over HTTPS, used for remote servers; spec revisions from 2025-03-26 onward fold this into a Streamable HTTP transport that can still stream over SSE). Claude Desktop, Cursor, and most local IDE integrations default to stdio. Enterprise deployments are starting to use SSE behind an internal gateway.
The vocabulary is small. After the initialize handshake you mostly see four groups of methods.
- tools/list: the server advertises the tools it offers and the JSON schema of each tool's arguments.
- tools/call: the model invokes a tool with arguments and the server returns a structured result.
- resources/list and resources/read: the server exposes addressable content (a file, a database row, a Confluence page) that the model can pull into its context.
- resources/subscribe: the host asks to be notified when a resource changes; each update can then be pulled into the next turn.
That is essentially the wire. Simple, well-specified, easy to parse — which is exactly why a DLP layer for it is feasible at all.
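To make the wire format concrete, here is a minimal sketch of one tools/call round trip as it would appear on a stdio pipe. The tool name, SQL string and id are made up for illustration; the envelope fields (jsonrpc, id, method, params, result) follow JSON-RPC 2.0, and the helper names are hypothetical.

```python
import json

# A tools/call request as a host might write it to the server's stdin.
# The tool name "query" and its arguments are hypothetical.
request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {"name": "query",
               "arguments": {"sql": "SELECT email FROM customers LIMIT 1"}},
}

# The matching response: same id, no method, payload under result.content.
response = {
    "jsonrpc": "2.0",
    "id": 7,
    "result": {"content": [{"type": "text", "text": "alice@example.com"}],
               "isError": False},
}


def frame(message: dict) -> bytes:
    """Serialise one message as a single newline-terminated JSON line."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def kind(message: dict) -> str:
    """Classify an envelope as request, notification or response."""
    if "method" in message:
        return "request" if "id" in message else "notification"
    return "response"


for message in (request, response):
    print(kind(message), frame(message).decode().strip())
```

Note that the response carries no method name. Anything that wants to treat tools/call results differently from other results has to remember which id belonged to which request.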
Four places sensitive data leaks in an MCP exchange
Every leak path you already worry about in a browser chat exists in MCP, plus two new ones the host generates without anyone typing a key.
1. Tool inputs
When a user prompts "summarise the latest deal in HubSpot for Acme," the agent builds a tools/call with the literal company name, the deal ID, and often the user's email or session token as arguments. Those arguments are user-supplied content in everything but appearance. A jailbreak prompt that smuggles a customer's NIR (the French social security number) into the query argument of a Postgres tool is functionally identical to typing the NIR into ChatGPT.
2. Tool outputs
This is the new surface, and it is the one most teams underestimate. A resources/read on a Confluence page returns whatever is on that page — including the api_key someone pasted into a runbook three years ago. A SQL tool returns rows that include emails, IBANs, NIR numbers, passport scans encoded as data URIs. The model then takes that output, summarises it, and the summary travels back across the network to Anthropic, OpenAI, Google or Mistral. The user never typed any of it.
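As a rough sketch of what scanning a tool output involves, the snippet below walks an arbitrary JSON result and reports which finding types appear in any string value. The three regexes are deliberately simplistic placeholders, not the detectors a real product ships, and the sample page content is invented.

```python
import re

# Illustrative detectors only. Real detectors add checksums, context
# and many more finding types.
DETECTORS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "api_key": re.compile(r"\bapi_key\s*[=:]\s*\S+", re.IGNORECASE),
}


def iter_strings(node):
    """Yield every string value nested anywhere in a JSON-like structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def find_types(result) -> list:
    """Return the sorted finding types present in a tool or resource result."""
    found = set()
    for text in iter_strings(result):
        for name, pattern in DETECTORS.items():
            if pattern.search(text):
                found.add(name)
    return sorted(found)


# A made-up resources/read result for a runbook page.
page = {"contents": [{
    "uri": "confluence://runbook/42",
    "text": "Restart with api_key=sk_live_abc123 then mail ops@example.com",
}]}
print(find_types(page))
```

The same walker works on tools/call arguments, which is why scanning both directions costs little extra code.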
3. Resource subscriptions
An IDE agent subscribed to file:///home/dev/secrets.env for hot-reload during a debug session will stream every save event into the model's context. Long-lived subscriptions are particularly dangerous because they decouple the data movement from the user's attention — nobody is watching the conversation at 02:14 when the secrets file rotates.
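One cheap control is to look at the URI in a resources/subscribe request before the subscription is ever established. The sketch below assumes a hypothetical deny list of glob patterns; the patterns and function name are illustrative, not part of any product.

```python
from fnmatch import fnmatchcase
from urllib.parse import unquote, urlparse

# Hypothetical deny list; tune it to your own file layout.
SENSITIVE_GLOBS = ["*.env", "*/.ssh/*", "*/.aws/credentials", "*.pem"]


def is_sensitive_subscription(message: dict) -> bool:
    """True if the message subscribes to a resource whose path looks secret."""
    if message.get("method") != "resources/subscribe":
        return False
    uri = message.get("params", {}).get("uri", "")
    path = unquote(urlparse(uri).path)
    return any(fnmatchcase(path, glob) for glob in SENSITIVE_GLOBS)


subscribe = {
    "jsonrpc": "2.0", "id": 3, "method": "resources/subscribe",
    "params": {"uri": "file:///home/dev/secrets.env"},
}
print(is_sensitive_subscription(subscribe))
```

A path check like this complements content scanning rather than replacing it: it stops the obvious cases early, while the scanner still has to inspect whatever the subscription delivers.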
4. Accumulated host context
The host (Claude Desktop, Cursor) is responsible for assembling the prompt that goes to the model. It folds in previous tool outputs, resource snapshots, and conversation history. Over a long session that context window quietly accumulates a slice of every system the user touched. Most hosts ship that whole context with every turn. If you only inspect the user's typed prompt, you miss most of what actually traverses the wire.
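To see how lopsided an assembled context can be, a simple tally by source is enough. The sketch below is purely illustrative: the source labels and the character counts are invented for the example, not measurements from any host.

```python
from collections import Counter


def share_by_source(context_parts) -> dict:
    """Return the fraction of characters each source contributes.

    context_parts is a list of (source_label, text) pairs.
    """
    totals = Counter()
    for source, text in context_parts:
        totals[source] += len(text)
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {source: count / grand_total for source, count in totals.items()}


# Made-up sizes for one turn of a long session.
parts = [
    ("user_prompt", "x" * 100),
    ("tool_output", "y" * 700),
    ("resource", "z" * 200),
]
for source, share in share_by_source(parts).items():
    print(f"{source}: {share:.0%}")
```

Running the same tally on your own sessions is a quick way to check how much of the outbound payload a prompt-only scanner would actually see.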
Why generic DLP misses all of this
Three reasons, all structural.
No HTTP intercept point. Browser DLP relies on hooking fetch in the page context. A stdio MCP server has no page, no browser, no fetch. The bytes move through an anonymous pipe between two processes the OS treats as siblings. Web-proxy DLP, CASB, SWG — none of them see the conversation.
SSE traffic does not look like chat. Even when MCP runs over HTTPS, the framing is text/event-stream with JSON-RPC envelopes, not the chat-completion shapes that DLP vendors have built signature libraries for. Generic content inspection treats the stream as opaque telemetry.
The sensitive payload arrives from the server. Classical DLP is outbound: it watches what the user sends. MCP inverts the threat. The high-risk content is the tool's response, which arrives at the host, lands in the model's context, and then leaves the device on the next outbound LLM call. By the time it crosses the network boundary, it is already laundered through a model summary and may not match any literal pattern.
Inspect the response, not just the request. An MCP-aware DLP that only scans tools/call arguments catches half the problem. The other half is the result field on the way back.
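Scanning the result field has one practical wrinkle: a JSON-RPC response carries only an id, not the method it answers. A scanner therefore has to remember outstanding requests. The class below is a minimal sketch of that bookkeeping; the class and method names are hypothetical, and it only tracks requests the host sends to the server.

```python
class Correlator:
    """Remember host-to-server request ids so responses can be attributed."""

    def __init__(self):
        self._pending = {}  # request id -> method name

    def classify(self, message: dict, direction: str):
        """Return the method a message belongs to, or None if unknown."""
        has_method = "method" in message
        if direction == "to_server" and has_method:
            if "id" in message:  # a request, not a notification
                self._pending[message["id"]] = message["method"]
            return message["method"]
        if direction == "to_host" and not has_method and "id" in message:
            # A response: look up (and forget) the request it answers.
            return self._pending.pop(message["id"], None)
        return message.get("method")


correlator = Correlator()
correlator.classify(
    {"jsonrpc": "2.0", "id": 1, "method": "resources/read",
     "params": {"uri": "confluence://runbook/42"}},
    "to_server",
)
print(correlator.classify(
    {"jsonrpc": "2.0", "id": 1, "result": {"contents": []}}, "to_host"))
```

With the method recovered, policy can differ per call: a resources/read result might be anonymized while a tools/list result passes through untouched.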
What an MCP-aware DLP layer actually does
The component is small. It sits between the host process and the MCP server and does four things.
- Parse JSON-RPC envelopes. Recognise tools/call requests, tools/call responses, resources/read responses, and subscription notifications. Pass everything else through untouched with negligible overhead.
- Apply the same finding-type detection used elsewhere. The detectors for api_key, password, jwt, private_key, credit_card, IBAN, email, phone, SSN, NIR, passport, connection_string, and source code are the same ones the browser extension uses. There is no second taxonomy to maintain.
- Act per finding type. Monitor, Anonymize with format-preserving redaction, or Block: the policy matrix is identical to the browser and desktop policy matrix. A SOC analyst who already understands the Operator Console understands MCP incidents without retraining.
- Emit the same incident envelope. Every detection becomes an incident with the same shape, the same severity model, the same webhook payload to Slack, Splunk HEC, or Microsoft Sentinel. Investigators stay in one tool.
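The "act per finding type" step can be pictured as a lookup table plus a strictest-wins rule. The sketch below is an assumption-laden illustration: the policy entries are invented, and the masking function is a simple character mask standing in for whatever format-preserving redaction a real product applies.

```python
import re

# Hypothetical policy matrix: finding type -> action.
POLICY = {"email": "anonymize", "api_key": "block", "phone": "monitor"}
SEVERITY = {"allow": 0, "monitor": 1, "anonymize": 2, "block": 3}


def decide(finding_types) -> str:
    """Pick the strictest action among the findings; unknown types are monitored."""
    actions = [POLICY.get(finding, "monitor") for finding in finding_types]
    return max(actions, key=SEVERITY.__getitem__, default="allow")


def mask_preserving_format(value: str) -> str:
    """Replace digits with 0 and letters with x, keeping length and punctuation."""
    return re.sub(r"[A-Za-z]", "x", re.sub(r"\d", "0", value))


print(decide(["email", "api_key"]))
print(mask_preserving_format("alice@example.com"))
```

Keeping the shape of the value (an address still looks like an address) matters because the model downstream can usually keep working with the masked text, whereas a hard block ends the tool call.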
A 40-line Python wrapper that scans both directions
Concretely, this is what a stdio wrapper looks like. It launches the real MCP server as a subprocess and pipes every message through a content scanner before relaying it.
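    import asyncio, json, sys
    from zeuslock_mcp import Scanner, Policy

    scanner = Scanner(policy=Policy.from_file("policy.yaml"))

    async def pipe(reader, writer, host_out, direction):
        """Relay newline-delimited JSON-RPC from reader to writer, scanning each message."""
        while True:
            line = await reader.readline()
            if not line:  # EOF: the peer closed its end of the pipe
                break
            try:
                msg = json.loads(line)
            except json.JSONDecodeError:
                # Not JSON (for example a stray log line): relay it untouched.
                writer.write(line); await writer.drain(); continue
            verdict = scanner.inspect(msg, direction=direction)
            if verdict.action == "block":
                # The error always goes back to the host, never on to the server.
                err = {"jsonrpc": "2.0", "id": msg.get("id"),
                       "error": {"code": -32001, "message": verdict.reason}}
                host_out.write((json.dumps(err) + "\n").encode())
                await host_out.drain(); continue
            # Relay the scanner's (possibly redacted) payload.
            writer.write((json.dumps(verdict.payload) + "\n").encode())
            await writer.drain()
        if writer is not host_out:
            writer.close()  # forward EOF so the wrapped server can exit

    async def main(cmd):
        proc = await asyncio.create_subprocess_exec(
            *cmd, stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE)
        loop = asyncio.get_running_loop()
        # Wrap our own stdin and stdout in asyncio streams (Unix event loops only).
        host_in = asyncio.StreamReader()
        await loop.connect_read_pipe(
            lambda: asyncio.StreamReaderProtocol(host_in), sys.stdin)
        transport, protocol = await loop.connect_write_pipe(
            asyncio.streams.FlowControlMixin, sys.stdout)
        host_out = asyncio.StreamWriter(transport, protocol, None, loop)
        await asyncio.gather(
            pipe(host_in, proc.stdin, host_out, direction="to_server"),
            pipe(proc.stdout, host_out, host_out, direction="to_host"))

    asyncio.run(main(sys.argv[1:]))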
The user points Claude Desktop at zeuslock-mcp postgres-mcp-server --dsn ... instead of postgres-mcp-server --dsn .... Same UX, full visibility, redaction or block decisions made before the bytes leave the local machine.
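Repointing a host at the wrapper amounts to rewriting its server entries. The sketch below assumes the host keeps its servers in a JSON config with a top-level mcpServers map of command and args, as Claude Desktop's claude_desktop_config.json does; the function name is hypothetical and the DSN is left elided as in the example above.

```python
import copy
import json


def wrap_servers(config: dict, wrapper: str = "zeuslock-mcp") -> dict:
    """Return a copy of the config with every stdio server launched via the wrapper."""
    wrapped = copy.deepcopy(config)
    for server in wrapped.get("mcpServers", {}).values():
        if "command" not in server:  # e.g. a remote server defined by URL
            continue
        if server["command"] == wrapper:  # already wrapped, leave it alone
            continue
        server["args"] = [server["command"], *server.get("args", [])]
        server["command"] = wrapper
    return wrapped


config = {"mcpServers": {"postgres": {
    "command": "postgres-mcp-server",
    "args": ["--dsn", "..."],
}}}
print(json.dumps(wrap_servers(config), indent=2))
```

Because the rewrite is idempotent, it can run on every login or from a device-management script without stacking wrappers on top of each other.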
The supply chain problem nobody is talking about yet
The community MCP marketplace has the same characteristics as npm in 2015 and the VS Code marketplace in 2019: low publishing friction, no signing requirement, no provenance, install instructions that paste a stranger's binary into the user's home directory. A malicious MCP server can do everything a normal server does — read files, run queries, hit APIs — plus quietly exfiltrate the model's context window through a side-channel tool call.
The CNIL, the BSI, and ENISA have not yet published guidance specifically on MCP. They will. In the meantime, treat every third-party MCP server as you would treat an unsigned browser extension or a random VS Code plugin: a piece of code that runs with your privileges and can see whatever your agent sees.
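Until signing and provenance exist, pinning is something you can do yourself. The sketch below checks a server executable against a locally maintained table of SHA-256 digests before it is allowed to launch; the table format and function names are assumptions made for illustration.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path) -> str:
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_pinned(path, pins: dict) -> bool:
    """True only if the file's digest matches the pinned digest for its name.

    pins maps a file name to the expected hex digest (hypothetical format).
    """
    expected = pins.get(Path(path).name)
    if expected is None:
        return False  # unknown servers are rejected by default
    return hmac.compare_digest(expected, sha256_of(path))
```

A wrapper could call is_pinned on the target command before spawning it and refuse to start anything that is unknown or has changed since it was vetted.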
Defence checklist for the next 90 days
- Inventory. Ask every engineer which MCP servers are configured in their Claude Desktop, Cursor, and Claude Code. The list will surprise you.
- Vet. For each server, identify the publisher, read the source if available, and pin the version. Block install of unsigned servers from the marketplace by default.
- Wrap. Put an MCP-aware DLP scanner between the host and every server. Scan both directions. Start in Monitor mode for two weeks to learn the baseline.
- Allowlist tool calls per role. Developers may need filesystem/write; finance does not. The tool name is exactly the policy primitive you want.
- Inspect resource outputs. A scan that only looks at tools/call arguments and ignores resources/read responses misses the largest category of leaks.
- Log every tools/call for audit. Tool name, arguments hash, result size, finding types detected, action taken. Ship to your SIEM with HMAC-signed webhooks.
- Cover SSE remote servers too. If you let a remote MCP server connect over HTTPS, terminate the SSE stream at a gateway under your control and run the same scanner there.
- Map the controls to your frameworks. MCP scanning slots into NIS2 risk management, DORA ICT third-party risk, and EU AI Act Article 12 record-keeping (logging) obligations. Document the mapping now so audit does not catch you flat-footed.
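Two of the checklist items, the per-role allowlist and the signed audit log, fit in a few lines. The sketch below is illustrative only: the role table, the record field names and the signing scheme (HMAC-SHA256 over canonical JSON) are assumptions, not a documented webhook format.

```python
import hashlib
import hmac
import json

# Hypothetical role table keyed on tool name.
ROLE_ALLOWLIST = {
    "developer": {"filesystem/read", "filesystem/write"},
    "finance": {"filesystem/read"},
}


def is_allowed(role: str, tool_name: str) -> bool:
    """Deny by default: unknown roles and unlisted tools are rejected."""
    return tool_name in ROLE_ALLOWLIST.get(role, set())


def canonical(value) -> bytes:
    """Serialise with sorted keys so equal values always hash the same."""
    return json.dumps(value, sort_keys=True, separators=(",", ":")).encode()


def audit_record(tool_name, arguments, result, finding_types, action) -> dict:
    """Build a log entry that never contains the raw arguments or result."""
    return {
        "tool": tool_name,
        "arguments_sha256": hashlib.sha256(canonical(arguments)).hexdigest(),
        "result_bytes": len(canonical(result)),
        "finding_types": sorted(finding_types),
        "action": action,
    }


def sign(record: dict, secret: bytes):
    """Return the webhook body and its HMAC-SHA256 signature."""
    body = canonical(record)
    return body, hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(body: bytes, signature: str, secret: bytes) -> bool:
    """Receiver side: recompute the signature and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Hashing the arguments instead of logging them keeps the audit trail itself from becoming a second copy of the sensitive data.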
Why this matters in 2026, not 2028
MCP is the next email-attachment threat surface for AI agents. Email attachments took a decade and several headline-grade incidents (Melissa, ILOVEYOU, Locky) before enterprises put real inspection in front of them. We do not have a decade this time. Agentic IDEs are already touching production databases on behalf of every developer who installed the wrong server from a marketplace card.
The orgs that win the next eighteen months will be the ones that treated MCP as a first-class DLP surface before the first big breach hits the news. Wrap your servers. Scan both directions. Log everything. The pattern is well understood; the only question is whether you deploy it before or after the post-mortem.
If you want the production-grade version of the wrapper above, the Zeuslock MCP integration is documented at zeuslock.ai/docs, and you can request a demo from the same site.