Definition
Format-preserving anonymisation
Format-preserving anonymisation is a substitution technique where the replacement value satisfies the same syntactic format as the original (valid IBAN checksum, valid Luhn card number, valid-looking email) — so downstream systems that validate format still accept the placeholder.
For an AI DLP, format preservation is what lets the LLM reason normally about the data structure ("this looks like a customer's IBAN, route it to the EU department") without ever seeing the real value. Without format preservation, generic [REDACTED] markers break LLM reasoning chains and produce useless answers — leading users to bypass the DLP entirely.
Why it matters
- ✓Bypass rate is the early-warning indicator: if users start retyping prompts to evade the DLP, anonymisation is too aggressive or too disruptive.
- ✓Format-preserving values are the technical key to keeping a Block / Anonymize mode rolled out for the long term.