Anatomy of an AI Data Leak: 3 Real Telemetry Stories

Three incidents, three lessons, zero customer names

The most useful thing we can publish is not a benchmark and not a whitepaper. It is what actually happens inside real organisations the week after they switch Zeuslock on. The three stories below are composite walkthroughs based on patterns we see repeatedly across customer telemetry. Industries, headcounts, regions and specifics have been changed enough that no individual organisation is identifiable. The detection logic and the human response are reported as they happened.

Each story follows the same shape: the setup (who was doing what), what we saw (the finding, with a redacted preview), what fired (which detector, severity, action), and what the customer did next (the policy or process change that closed the loop). The point is not to shame employees. In every case the employee was trying to do their job faster. The point is to show the gap between intent and exposure, and to show what a useful control actually looks like in the moment.

Story 1: an AKIA key, a stack trace, and an honest debugging session

A backend engineer at a Series-B fintech was three hours into a production incident. The deploy pipeline had wedged on a permissions error and the team was burning the European morning trying to ship a fix before market open. The engineer pulled the failing CloudWatch stack trace, opened GitHub Copilot inside VS Code, and asked the obvious question: why is this failing, and what permission do I need to add to this IAM role?

The pasted block was about 60 lines long. Buried in line 14 was the literal AWS access key ID being used by the deploy runner, in the form AKIAIO******************, alongside the region, the role ARN, and enough of the surrounding environment to make the credential trivially usable if it ever left the building. The engineer had not copied a secret on purpose. They had copied a stack trace that happened to contain one.

The Zeuslock browser extension and the desktop agent both fired within milliseconds. The triggered detector was api_key:aws on the strong regex layer (the AKIA prefix plus a 16-character body of uppercase letters and digits, 20 characters in total), confirmed in under 80 ms by the EU-hosted ML model with a high-confidence score. Severity: critical. Action under the customer's developer policy at that moment: Anonymize. The pasted text reached Copilot with the key replaced by a structurally valid fake, the rest of the trace untouched. Copilot still produced a perfectly useful answer about the missing s3:PutObject permission, because the model never needed the real key to reason about IAM.
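
As a rough illustration of the regex layer and the Anonymize action described above, here is a minimal Python sketch. It is not Zeuslock's implementation: the pattern (AKIA followed by 16 uppercase letters or digits), the helper names and the replacement strategy are all assumptions made for this example.

```python
import re
import secrets
import string

# Assumed pattern: "AKIA" followed by 16 uppercase letters or digits (20 chars total).
AWS_KEY_ID = re.compile(r"\bAKIA[A-Z0-9]{16}\b")

_BODY_ALPHABET = string.ascii_uppercase + string.digits


def fake_key_id(avoid: str = "") -> str:
    """Return a random string shaped like an access key ID, never equal to `avoid`."""
    while True:
        candidate = "AKIA" + "".join(secrets.choice(_BODY_ALPHABET) for _ in range(16))
        if candidate != avoid:
            return candidate


def anonymize_aws_keys(text: str) -> tuple[str, int]:
    """Swap every matched key for a fake; return (new_text, number_of_matches)."""
    replacements: dict[str, str] = {}

    def swap(match: re.Match) -> str:
        real = match.group(0)
        # The same real key always maps to the same fake within one submission.
        if real not in replacements:
            replacements[real] = fake_key_id(avoid=real)
        return replacements[real]

    return AWS_KEY_ID.subn(swap, text)
```

Everything around the key (region, role ARN, the error text itself) passes through unchanged, which is why the model can still answer the IAM question.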

The Operator Console recorded the incident with full attribution: user, host, source URL (github.com/copilot), and a salted hash of the raw secret so the security team could correlate against AWS CloudTrail without ever storing the secret itself. The security lead's response was twofold. First, the AKIA key was rotated within the hour as a hygiene measure, even though it had never actually left the device. Second, the developer policy was tightened from Anonymize to Block for the api_key:* family across all engineering groups, and a pre-commit gitleaks hook was added to every backend repository so the same class of leak would be caught one layer earlier, at git commit rather than at the paste into an AI tool.
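
The salted-hash idea can be sketched in a few lines of Python. The story does not say which construction is used, so the HMAC-SHA256 choice and the function names below are assumptions for illustration only.

```python
import hashlib
import hmac


def secret_fingerprint(secret: str, salt: bytes) -> str:
    """Keyed SHA-256 digest of a secret; only this digest is stored, never the secret."""
    return hmac.new(salt, secret.encode("utf-8"), hashlib.sha256).hexdigest()


def matches(candidate: str, stored_fingerprint: str, salt: bytes) -> bool:
    """Check whether a candidate value (e.g. a key ID seen in audit logs) matches."""
    computed = secret_fingerprint(candidate, salt)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(computed, stored_fingerprint)
```

With a fingerprint like this, a security team could hash the key IDs that appear in their own audit trail and look for a match, without the incident record ever containing the key.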

Story 2: 200 lines of customer history, pasted into ChatGPT for “sentiment”

A senior support analyst at a mid-market online retailer was preparing a weekly escalation summary for the head of customer service. The standard internal tool was slow that day, so the analyst exported the last week of complaint threads from the CRM into a CSV, opened ChatGPT, pasted the 200-line block, and asked: summarise the top three complaints by theme and tell me which ones look angriest.

The block contained 47 customer email addresses, 31 postal addresses across France, Spain and Germany, the full order-line history for each complaint, and in 12 cases a partial card identifier of the form **** **** **** 4242. It was, from a GDPR standpoint, a fairly textbook unlawful onward transfer to a US sub-processor outside the retailer's stated processing chain. The analyst was not malicious. They were faster with ChatGPT than they were with the internal tool, and nobody had ever told them otherwise in concrete terms.

Zeuslock fired three detectors on a single submission: email (47 hits, high confidence), address (31 hits, high confidence, locale-aware for FR / ES / DE), and credit_card_partial on the masked PAN fragments. Severity: high. Action under the support-team policy: Anonymize. The prompt that actually reached ChatGPT was rewritten on the fly: every email became user1@example.com, user2@example.com and so on, every address became a structurally valid but fake European address in the same country, and every partial PAN kept its masked shape with the visible last four digits replaced by fake ones. The order-line text and the actual complaint sentences were untouched. ChatGPT still produced an entirely useful summary (the top three themes were delivery delays, sizing inconsistency on a specific product line, and a billing display bug) because none of that reasoning needed real personal data.
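
The email rewrite in particular is easy to picture in code. The following Python sketch shows one way to map each distinct address to a stable placeholder; the regex is deliberately simple and the function name is hypothetical, so treat it as an illustration rather than the product's logic.

```python
import re

# Deliberately simple email pattern for illustration; it is not RFC 5322 complete.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def anonymize_emails(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email with userN@example.com; return text and mapping."""
    mapping: dict[str, str] = {}

    def swap(match: re.Match) -> str:
        real = match.group(0).lower()  # treat differently-cased copies as one address
        if real not in mapping:
            mapping[real] = f"user{len(mapping) + 1}@example.com"
        return mapping[real]

    return EMAIL.sub(swap, text), mapping
```

Because the same customer always becomes the same placeholder, the model can still tell that two complaints came from one person, which matters for a by-theme summary.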

The customer made a small but high-leverage change. They did not block ChatGPT for the support team. They added a one-page section to the internal user guide titled “the safe pattern”, with a screenshot of the Zeuslock anonymized preview and the sentence: this is what your prompt actually looked like to the model, and the answer was still useful, so you can keep using this pattern. Behavioural compliance went up because the team saw, visually, that the protective behaviour cost them nothing.

The behaviour we want is not “stop using AI”. The behaviour we want is “use AI freely, on a payload that has been stripped of the things that should never have been in the payload in the first place”. Anonymization is the bridge.

Story 3: 400 lines of core IP, refactoring advice, and a desktop app

A senior engineer at a European healthtech company was overhauling a module deep inside the proprietary scoring engine that is, candidly, the entire reason the company has investors. The engineer opened the Claude desktop app and pasted a 400-line Python module to ask for refactoring suggestions, in particular how to flatten a nested set of conditionals into a more testable shape. The module contained the actual scoring weights, the function names that map directly to the patented methodology, and the internal package prefix that only exists inside that company.

This is the hardest class of leak to catch. There is no regex for “our intellectual property”. The Zeuslock detection that fired here was a layered one: a generic source_code heuristic flagged the submission as a large block of Python with high syntactic density and low natural-language ratio, and a custom detector the customer had built three weeks earlier — matching their internal package prefix and three reserved function names — raised the severity from medium to critical. Action under the IP-owning team's policy: Block. The desktop agent intercepted the submission before it left the machine. Claude received nothing. The engineer received a modal explaining what was matched, a link to the internal policy page, and a one-click path to request an exception through the security team if the refactoring was genuinely safe to do externally.

The follow-up was the most operationally interesting of the three. Rather than tighten the policy for everyone, the customer split the developer policy into two profiles. Teams that touch the scoring engine, the patent code paths, and the model-training pipeline run under a strict profile: source_code and the custom IP detector are both set to Block. Teams that ship utility services, internal tooling, and the public marketing site run under a looser profile: source_code stays at Anonymize so they can keep getting refactoring help on code where the cost of an external paste is essentially zero. The split was published as a one-paragraph policy note and made enforceable through the existing SCIM groups, so no engineer had to remember which mode they were in — the agent knew.
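
One way to picture the split is as a lookup from identity group to profile to action. The Python sketch below is only an illustration: the group names, the default action, and the rule that the strictest profile wins when someone belongs to several groups are all assumptions, not details from the customer's configuration.

```python
# Hypothetical profiles mirroring the strict/loose split described above.
PROFILES = {
    "strict": {"source_code": "block", "custom_ip": "block"},
    "loose": {"source_code": "anonymize"},
}

# Hypothetical identity-group names mapped to a profile.
GROUP_PROFILE = {
    "eng-scoring-engine": "strict",
    "eng-model-training": "strict",
    "eng-internal-tools": "loose",
    "eng-marketing-site": "loose",
}

# Assumption: detectors a profile does not mention fall back to this action.
DEFAULT_ACTION = "anonymize"

ACTION_RANK = {"allow": 0, "anonymize": 1, "block": 2}


def resolve_action(user_groups: list[str], detector: str) -> str:
    """Pick the strictest action any of the user's groups assigns to a detector."""
    actions = [
        PROFILES[GROUP_PROFILE[group]].get(detector, DEFAULT_ACTION)
        for group in user_groups
        if group in GROUP_PROFILE
    ]
    if not actions:
        return DEFAULT_ACTION
    return max(actions, key=ACTION_RANK.__getitem__)
```

Resolving on group membership rather than on a per-user setting is what lets the policy follow people automatically as they change teams.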

What these three have in common

Read together, the three stories rhyme more than they differ. In every case the employee was being productive, not careless. In every case the secret or the personal data was incidental to the question being asked — nobody pasted a key in order to leak a key, they pasted a key because it was attached to the error message they actually wanted help with. In every case the model would have produced the same useful answer on an anonymized or empty payload. And in every case the durable fix was not a memo. It was a policy change wired into the agent, supported by a small process change wired into the team.

If you want a concrete list of what to do this quarter:

Move the api_key:* family to Block for engineering groups. The false-positive rate on AWS, Stripe, GitHub, Slack and OpenAI key patterns is low enough that Block is appropriate from week three. Pair it with a pre-commit hook in the repos that matter.
Keep PII at Anonymize for non-engineering teams, and document the safe pattern. Show the redacted preview. Show the model still being useful. Behavioural change happens when the protective behaviour visibly costs nothing.
Build at least one custom detector for your own IP. Pick the internal package prefix, the patent identifiers, the reserved function names. Backtest it against the last 30 days of incidents in the Operator Console before you flip it to Block. The detector you build in an afternoon will catch a leak that no generic vendor regex ever would.
Split the developer policy by team, not by tool. The IP-owning teams need a different profile than the utility teams. Wire it through your existing identity groups so it stays enforced automatically as people join, move and leave.
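
The backtest step in the custom-detector recommendation can be as simple as replaying a candidate pattern over recent incident text before switching it to Block. The Python sketch below assumes incidents are available as dictionaries with id, seen_at and text fields; that shape is hypothetical and is not the Operator Console's export format.

```python
import re
from datetime import datetime, timedelta, timezone


def backtest(pattern: re.Pattern, incidents: list[dict], now: datetime, days: int = 30) -> dict:
    """Count how many incidents in the last `days` days the pattern would have matched."""
    cutoff = now - timedelta(days=days)
    recent = [incident for incident in incidents if incident["seen_at"] >= cutoff]
    hits = [incident for incident in recent if pattern.search(incident["text"])]
    return {
        "reviewed": len(recent),
        "would_block": len(hits),
        "ids": [incident["id"] for incident in hits],
    }
```

Reviewing the returned ids by hand is the point of the exercise: if the pattern would have blocked prompts that were harmless, it needs tightening before it goes live.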

None of this requires a new tool, a heavyweight process, or a board memo. It requires looking at the three rows in your incidents dashboard that scared you most this month, and being honest about which policy mode would have stopped them. For the operator-side detail, see the configuring detection policies guide and the custom detectors reference. The three stories above started in the same place yours did. They ended with a policy change small enough to ship on a Friday.
