Definition
Anonymisation (in DLP)
In an AI DLP context, anonymisation is the act of substituting a detected sensitive substring (a credit card, an IBAN, an email, an API key) with a structurally similar but non-sensitive placeholder before the prompt is sent to the LLM.
The technique to look for is format-preserving anonymisation: the placeholder passes the same format checks the original would have passed (a valid IBAN checksum, a Luhn-valid card number, a plausible email address at example.com). This preserves the model's ability to reason about the structure of the data while removing the privacy and security exposure. Simple [REDACTED]-style masking, by contrast, can break the model's reasoning and produce unusable answers.
Why it matters
- Anonymisation lets you preserve productivity (the user still gets an answer) while removing data exposure (the model provider never sees the real value).
- Format-preserving placeholders avoid the "useless answer" failure mode of simple redaction.
- GDPR distinguishes anonymisation (irreversible; anonymous data falls outside GDPR scope under Recital 26) from pseudonymisation (defined in Article 4(5); reversible with additional information, and still in scope), so the technical implementation matters legally.
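The technical side of the anonymisation versus pseudonymisation distinction can be sketched as whether a placeholder-to-original mapping is retained. The `PlaceholderVault` class below is hypothetical and exists only to show the difference: with `keep_mapping=True` the substitution can be reversed (pseudonymisation), and with `keep_mapping=False` nothing is stored to reverse it. Whether discarding the mapping is enough for data to count as anonymous also depends on what else in the prompt could identify a person, which this sketch does not address.

```python
from dataclasses import dataclass, field


@dataclass
class PlaceholderVault:
    """Hypothetical store that optionally remembers which placeholder replaced what."""

    keep_mapping: bool
    mapping: dict[str, str] = field(default_factory=dict)

    def substitute(self, real: str, placeholder: str) -> str:
        """Record the substitution only if reversibility is wanted."""
        if self.keep_mapping:
            self.mapping[placeholder] = real
        return placeholder

    def restore(self, text: str) -> str:
        """Put original values back wherever a remembered placeholder appears."""
        for placeholder, real in self.mapping.items():
            text = text.replace(placeholder, real)
        return text
```

With the mapping kept, `restore` can rewrite the model's answer so the user sees the original values; without it, `restore` returns the text unchanged.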