Privacy Data Transformation

Privacy Data Transformation is CustodianAI's de-identification API. It detects and removes sensitive information from text, documents, and images before that content reaches an LLM, is stored, or is shared downstream.

Two approaches in one API

Standard compliance modes

Rule-based and NER-based detection for common PII categories: names, emails, phone numbers, addresses, dates, and IDs. Outputs are mapped to regulatory frameworks.

Mode	Use for
`MASKED`	General-purpose redaction with `****`
`PROPRIETARY`	Guardian Layer — domain-aware semantic de-identification

GDPR, HIPAA, and CUSTOM modes are coming soon. → See what's planned
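
As a rough local illustration of what `MASKED`-style output looks like, the sketch below replaces two kinds of PII with `****` using plain regular expressions. The patterns and the `mask` helper are assumptions made for this example; they are not CustodianAI's detection logic, which also relies on NER and covers more categories.

```python
import re

# Illustrative patterns only (assumed for this sketch, not the service's rules).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def mask(text: str, placeholder: str = "****") -> str:
    """Replace every email address and phone number with the placeholder."""
    for pattern in (EMAIL, PHONE):
        text = pattern.sub(placeholder, text)
    return text


print(mask("Contact jane.doe@example.com or +1 415 555 0100."))
# prints: Contact **** or ****.
```

Names, addresses, and other free-form categories cannot be caught reliably with patterns like these, which is where NER-based detection comes in.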

→ Compliance Modes

Guardian Layer

Guardian Layer is CustodianAI's proprietary detection layer. It identifies domain-specific sensitive content beyond standard PII: proprietary terminology, internal identifiers, and contextually sensitive language that compliance rules don't cover.

→ Guardian Layer

What you can de-identify

Text — any string, from a single sentence to a full document
CSV files — every cell is processed independently
DOCX files — text content replaced in-place, formatting preserved
PDF files — word-level redaction, layout preserved
TXT files — plain-text in, plain-text out
Images — OCR-based detection and redaction (pending public release)
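
To make "every cell is processed independently" concrete for CSV files, here is a minimal local sketch. `deidentify_csv` and the `redact_name` stand-in are hypothetical names for this example, and the sketch assumes header cells are treated like any other cell; the actual service performs the detection on its side.

```python
import csv
import io
from typing import Callable


def deidentify_csv(csv_text: str, deidentify: Callable[[str], str]) -> str:
    """Apply `deidentify` to each cell on its own, keeping the row/column layout."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Each cell is transformed in isolation; no context is shared between cells.
        writer.writerow([deidentify(cell) for cell in row])
    return out.getvalue()


def redact_name(cell: str) -> str:
    """Hypothetical stand-in for a real de-identification call."""
    return cell.replace("Jane Doe", "****")


sample = "name,note\nJane Doe,Met Jane Doe on Monday\n"
print(deidentify_csv(sample, redact_name))
```

Because cells are handled one at a time, a value that is only sensitive in combination with a neighbouring column would not be detected by a per-cell approach like this one.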

→ File De-Identification

Authentication

All Privacy Data Transformation endpoints require your API key in the X-API-Key header:

X-API-Key: cai_your_key_here
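
A minimal sketch of attaching the header from Python's standard library follows. The endpoint URL and the JSON field names (`text`, `mode`) are placeholders assumed for illustration; take the real values from the API Reference. The request is only constructed here, not sent.

```python
import json
import urllib.request

API_KEY = "cai_your_key_here"
# Hypothetical URL, used only to make the example self-contained.
ENDPOINT = "https://api.example.com/deidentify"


def build_request(url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request that carries the API key in the X-API-Key header."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# The payload field names are assumptions, not the documented schema.
request = build_request(ENDPOINT, API_KEY, {"text": "Call Jane Doe", "mode": "MASKED"})

# urllib stores header names capitalized as "X-api-key"; HTTP header names are
# case-insensitive, so this is equivalent to X-API-Key on the wire.
print(request.get_header("X-api-key"))
```

Passing the request to `urllib.request.urlopen(request)` would send it; in practice, load the key from an environment variable or secret store instead of hard-coding it.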

→ API Reference: Authentication

Character credits

Each request consumes credits equal to the number of characters in the input text. File endpoints count the total characters across all processed cells or pages.
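
The rule above can be estimated locally before sending a request. This sketch assumes a "character" is one Unicode code point, which is what Python's `len` counts; the page does not state how the service counts characters, so treat the result as an estimate. The function names are hypothetical.

```python
from typing import Iterable


def text_credits(text: str) -> int:
    """Estimated credits for a text request: one credit per character."""
    return len(text)


def file_credits(units: Iterable[str]) -> int:
    """Estimated credits for a file: total characters across all cells or pages."""
    return sum(len(unit) for unit in units)


print(text_credits("Hello, Jane"))                      # prints: 11
print(file_credits(["Jane Doe", "jane@example.com"]))   # prints: 24
```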

→ Character Credits
