How We Built On-Device De-Identification So AI Never Sees Real Names
Most AI privacy is a policy. Ours is architecture. We run a named entity recognition model inside the browser to strip identifying information before it ever leaves the device. Here is how it works, what we tested, and where it applies.
An executive coach types session notes into a platform. The notes contain names, company details, family circumstances, health disclosures, financial figures, psychometric scores. The coach clicks “Generate Brief” and receives a pre-session coaching analysis that references every relevant pattern across months of sessions.
At no point did any of that identifying information leave the coach’s device.
The AI that generated the brief never saw a real name. It saw tokens: PHI_4fce, PHI_b2e1, PHI_c9d4. It analysed the relationships between facts, tracked patterns across sessions, identified avoidance behaviours and developmental arcs — all without knowing who any of it was about. The real names were swapped back in after the AI responded, inside the browser, before the coach saw anything.
This is not a privacy policy. It is a privacy architecture. And as far as we can determine, no other production system runs a named entity recognition model inside the browser to de-identify data in real time before every AI call.
Why most AI privacy is not privacy
When a business uses an AI tool with sensitive data, the standard approach is one of three things:
Policy-based privacy. The provider promises not to train on your data, not to retain it, not to share it. This is a contractual assurance. It depends on the provider honouring the contract, not being compelled by law enforcement, not suffering a breach, and not changing their terms.
Server-side de-identification. The data is sent to a server where identifying information is stripped before being passed to the AI. This is better than nothing, but the server sees the data in plaintext. The de-identification happens after the data has already left your environment.
Trusted execution environments. The data is processed inside hardware-secured enclaves where even the infrastructure provider cannot access it. This is strong protection, but it depends on trusting the hardware vendor’s attestation and adds infrastructure complexity.
Each of these has legitimate uses. But none of them solves the fundamental problem: the data still leaves the user’s device in identifiable form.
What we built instead
We run the de-identification inside the browser. The identifying information is stripped before the data ever leaves the device. The platform, the server, the AI provider — none of them ever see real names.
The system has three layers. Each one catches a different type of identifying information, and they run in sequence.
Layer 1: Regex — structured identifiers
The first pass catches data that follows predictable patterns: phone numbers (Australian and international formats), email addresses, street addresses, dates of birth, government identifiers (Medicare, ABN, SSN, NHS numbers), and LinkedIn URLs. The pattern library was originally built for Australian data and has since been expanded to cover international formats for the CoachIQ platform. These are deterministic — a phone number looks like a phone number — so pattern matching is fast and reliable.
Every match is replaced with a deterministic token. The same input always produces the same token, which means “Geoff Hartley” is always PHI_4fce across every session, every document, every report. This consistency matters; without it, the AI would see what looks like a different person in each session and lose the ability to track patterns over time.
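Layer 1 reduces to a few lines of logic. The sketch below is illustrative, not our production pattern library: the regexes are deliberately simplified, and FNV-1a is just one way to get deterministic tokens (the real system uses its own hashing and collision handling).

```typescript
// Deterministic token: the same input value always yields the same
// PHI_xxxx token. FNV-1a is used here purely for illustration.
function phiToken(value: string): string {
  let h = 0x811c9dc5;
  for (const ch of value) {
    h ^= ch.codePointAt(0)!;
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return "PHI_" + h.toString(16).padStart(8, "0").slice(0, 4);
}

// Simplified Layer 1 patterns (the real library covers far more formats).
const PATTERNS: RegExp[] = [
  /[\w.+-]+@[\w-]+\.[\w.]+/g,            // email addresses
  /\+?61[ ]?4\d{2}[ ]?\d{3}[ ]?\d{3}/g,  // AU mobile numbers, loosely
  /\b\d{2}\/\d{2}\/\d{4}\b/g,            // dates of birth, dd/mm/yyyy
];

// Scan the text, replace every match with its deterministic token,
// and record the mapping so the browser can swap values back later.
function regexLayer(text: string, map: Map<string, string>): string {
  let out = text;
  for (const re of PATTERNS) {
    out = out.replace(re, (m) => {
      const token = phiToken(m);
      map.set(token, m);
      return token;
    });
  }
  return out;
}
```

Because the token is a pure function of the value, no lookup table has to be synchronised across sessions: the same name masked in two different documents independently produces the same token.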
Layer 2: NER — natural language entities
Regex catches structured data, but it cannot catch a name mentioned in a sentence. “Geoff mentioned he’s been avoiding the conversation with Dave about the CFO hire” contains two names that no pattern can reliably match without understanding language.
This is where the named entity recognition model runs. We load a BERT-based NER model (Xenova/bert-base-NER) directly into the browser using Transformers.js. The model runs as WebAssembly — no server call, no API, no data leaving the device. It processes each paragraph of text, identifies person names, organisation names, and location names, and adds them to the token map.
The model is small (under 50MB), loads once, and caches in the browser. It preloads in the background on login so there is no delay when the user first generates a report. Its only job is spotting proper nouns. It does not do clinical reasoning, sentiment analysis, or anything else. One model, one task, running locally.
We maintain a false-positives list — words the NER model frequently misclassifies as entities. In coaching data, terms like “coach”, “advisor”, “driver”, “achiever”, “commander”, “integrity” and “legacy” regularly trigger false matches because they appear as capitalised role descriptors or psychometric profile labels. The list currently contains 24 terms and grows as we encounter new domain-specific misclassifications.
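In HuggingFace-style token-classification pipelines (including Transformers.js), the model emits one prediction per sub-word token with BIO-prefixed labels, so the raw output has to be stitched into whole entities before the false-positives filter runs. A simplified sketch of that post-processing step — the entity shape mirrors the library's output, but production field names and the full false-positives list differ:

```typescript
// One prediction per sub-word token, in the shape HuggingFace-style
// token-classification pipelines emit (B-/I- prefixed BIO labels).
interface NerToken {
  entity: string; // e.g. "B-PER", "I-PER", "B-ORG", "O"
  word: string;   // sub-word piece; a "##" prefix marks a continuation
}

// Domain words the model keeps misreading as entities (illustrative subset).
const FALSE_POSITIVES = new Set(["coach", "advisor", "driver", "achiever"]);

// Stitch BIO-tagged sub-word pieces into whole entity strings,
// then drop anything on the false-positives list.
function extractEntities(tokens: NerToken[]): string[] {
  const entities: string[] = [];
  let current = "";
  for (const t of tokens) {
    if (t.entity.startsWith("B-")) {
      if (current) entities.push(current);
      current = t.word;
    } else if (t.entity.startsWith("I-") && current) {
      current += t.word.startsWith("##") ? t.word.slice(2) : " " + t.word;
    } else {
      if (current) entities.push(current);
      current = "";
    }
  }
  if (current) entities.push(current);
  return entities.filter((e) => !FALSE_POSITIVES.has(e.toLowerCase()));
}
```

Each surviving entity then gets a deterministic token and joins the same map the regex layer built.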
Layer 3: Propagation — context sweep
After regex and NER have built the token map, the third layer sweeps the entire text for any remaining instances of values already in the map. If NER caught “Dave Mitchell” in one paragraph, the propagation pass ensures that “Dave” appearing alone three paragraphs later is also masked.
This handles the common pattern where a full name appears once and then the person is referred to by first name throughout. Without propagation, those subsequent references would leak through.
The sweep processes entries longest-first to prevent nested replacements, and uses word-boundary matching to avoid replacing “Lisa” inside “Lisabel” or similar substring collisions.
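The sweep itself is short once the ordering and boundary rules are in place. A minimal sketch — surnames in the examples are illustrative, and the production pass handles more alias forms than the bare first name shown here:

```typescript
// Escape a value so it can be embedded literally in a RegExp.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Sweep the text for every value already in the token map, plus each
// person's bare first name. Longest values are replaced first, so
// "Dave Mitchell" is masked before the bare "Dave" pass runs, and \b
// word boundaries stop "Lisa" from matching inside "Lisabel".
function propagate(text: string, map: Map<string, string>): string {
  const entries = [...map.entries()].flatMap(([token, value]) => {
    const parts: [string, string][] = [[token, value]];
    const first = value.split(" ")[0];
    if (first !== value) parts.push([token, first]); // bare first name
    return parts;
  });
  entries.sort((a, b) => b[1].length - a[1].length); // longest-first
  let out = text;
  for (const [token, value] of entries) {
    out = out.replace(new RegExp(`\\b${escapeRegExp(value)}\\b`, "g"), token);
  }
  return out;
}
```

Mapping the bare first name to the same token as the full name is what preserves continuity: the AI still sees one consistent person, however they are referred to.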
What the AI actually sees
When the coach requests a pre-session brief, the platform concatenates all session data for that member, runs the three-layer pipeline, and sends the masked data to the AI. A coaching session that reads:
Geoff mentioned he’s been avoiding the performance conversation with Dave Mitchell. Lisa thinks the CFO hire needs to happen before Q3. Revenue at Hartley Civil Engineering is tracking at $28.5M but EBITDA has dropped to 8.2%.
Becomes:
PHI_4fce mentioned he’s been avoiding the performance conversation with PHI_f3c7. PHI_d1f8 thinks the CFO hire needs to happen before Q3. Revenue at PHI_c9d4 is tracking at $28.5M but EBITDA has dropped to 8.2%.
The AI analyses the patterns, generates the brief, and returns text containing the same tokens. The browser then swaps the tokens back to real names before the coach sees the output. The coach’s experience is seamless — they see real names throughout and never interact with the masking system.
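The swap-back is the mirror of masking: the browser holds the token map for the request and rehydrates the response before rendering. A minimal sketch, assuming the model returned tokens verbatim as the system prompt instructs:

```typescript
// Replace every PHI_xxxx token in the AI's response with the real value
// from the browser-held map. Unknown tokens are left in place rather
// than guessed at, so a model that invents a token fails visibly.
function rehydrate(response: string, map: Map<string, string>): string {
  return response.replace(/PHI_[0-9a-f]{4}/g, (token) => map.get(token) ?? token);
}
```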
One line in the AI system prompt handles the model’s side: “PHI values in this data have been masked using deterministic token replacement (PHI_XXXX format). Return all PHI values as provided — do not substitute or invent replacements.”
What we tested
We ran a structured validation using realistic coaching session data spanning three sessions. The test document contained session notes, psychometric profiles (REACH assessment), values inventories, commitment tracking, group dynamics observations, financial snapshots, and personal context.
We generated the same pre-session coaching brief under two conditions: fully masked data and original unmasked data. The outputs were compared across five analytical dimensions:
- Pattern recognition across sessions
- Commitment tracking and follow-through analysis
- Session-over-session trajectory and score trends
- Psychometric integration (REACH profile mapping to coaching behaviour)
- Coaching question generation
The finding: analytical fidelity loss between the two outputs was minimal. The AI produced equivalent quality analysis with or without real names. It does not need identifying information to deliver intelligence. It needs the relationships between facts, not the labels.
The limitation we documented
The test also revealed something we chose to document openly rather than ignore.
Even with all names and identifiers stripped, the combination of contextual facts — industry, revenue band, staff count, family details, geographic indicators, psychometric scores — can create a unique fingerprint. Someone with domain knowledge could potentially narrow identification from context alone. A peer in the same industry and region, seeing a masked profile that matches someone they know, could make the connection.
This does not invalidate de-identification. It means masking is a strong layer of protection but not a guarantee of anonymity against a motivated, knowledgeable observer. This is consistent with the privacy literature on re-identification risk in rich datasets. We consider it important to be honest about this boundary rather than imply that de-identification equals anonymity.
For applications where even contextual fingerprinting is unacceptable, the architecture supports an additional layer: hardware-secured confidential computing via trusted execution environments. PHI masking combined with TEE processing means that even if the hardware enclave were compromised, the attacker gets only tokenised data. Defence in depth.
Where this applies beyond coaching
We built this for executive coaching. But the architecture is domain-agnostic. The three-layer pipeline works on any text where identifying information needs to be stripped before AI processing.
Clinical psychology and counselling. Session notes, treatment plans, progress notes. Therapists need AI to help with documentation but cannot send client names to an API. This is already running in production across two platforms: ConfideAI for mental health professionals and MycenAI for psychedelic-assisted therapy practitioners.
Legal. Case notes, client correspondence, litigation strategy documents. Legal privilege depends on confidentiality. A lawyer who sends client names to an AI provider has a privilege problem. On-device de-identification means the AI analyses the legal reasoning without ever knowing who the client is.
Accounting and financial services. Client financial data, tax structuring advice, AML/CTF documentation. Regulatory obligations around client confidentiality are explicit. De-identified data can be processed by AI without triggering data handling obligations that apply to identified data.
Any professional seeking AI assistance on a sensitive matter. A business owner who wants to use AI to think through a dispute, a medical situation, or a personal legal matter — without creating a record of that information on a third-party server.
The pattern is the same every time. Sensitive data exists locally. AI analysis would be valuable. The barrier is that sending the data to an AI provider creates a confidentiality risk. On-device de-identification removes that barrier architecturally, not contractually.
The architecture decision that makes this possible
The reason this works in the browser is a decision made early: the NER model’s only job is entity detection. It does not do any of the analytical work. It spots proper nouns and adds them to a map. That task is small enough for a compact model running as WebAssembly.
The analytical work — pattern recognition, trajectory analysis, coaching intelligence — is done by a large language model via API, but that model only ever sees tokens. The division of labour is clean: a small local model handles identification, a large remote model handles analysis, and the two never share real data.
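Stripped to its shape, the division of labour is a small orchestration: local masking, a remote call that only ever receives tokens, local rehydration. A schematic sketch with the remote model stubbed out — the function names are ours for illustration, not the production API:

```typescript
type Masker = (text: string, map: Map<string, string>) => string;
type RemoteModel = (maskedPrompt: string) => Promise<string>;

// The only data that crosses the network boundary is the masked text;
// the token map never leaves this function's scope (i.e. the browser).
async function generateBrief(
  sessionNotes: string,
  mask: Masker,
  callModel: RemoteModel,
): Promise<string> {
  const map = new Map<string, string>();
  const masked = mask(sessionNotes, map);
  const analysis = await callModel(masked); // sees tokens only
  return analysis.replace(/PHI_[0-9a-f]{4}/g, (t) => map.get(t) ?? t);
}
```

Swapping the masker or the prompt templates changes the domain; the orchestration stays the same.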
This separation is what makes the architecture portable across domains. The NER model does not need to understand coaching, or law, or clinical psychology. It needs to understand that “Dave Mitchell” is a person and “Hartley Civil Engineering” is an organisation. The domain expertise lives in the prompts and templates, not in the de-identification layer.
For more on how we approach data handling across AI systems, see What Happens to Your Data When You Press ‘Send’ on an AI Tool. For the broader privacy architecture including hardware-secured enclaves, see Why AI Safety Features Are Load-Bearing Architecture.
If your practice handles sensitive data and you want AI analysis without the confidentiality risk, start with a conversation.