
Detection vs. De-identification

Privacy Data Transformation separates the two steps of privacy protection into distinct operations. Understanding the difference helps you choose the right endpoint for your use case.

The two steps

Step | What it does | Endpoint
Detection | Identify which terms are sensitive; return a list with metadata | POST /api/v1/analyze/text/proprietary
De-identification | Transform or remove those terms in the output text | POST /api/v1/deidentify/text and Guardian Layer endpoints

When to use detection first

Most integrations go straight to de-identification. Detection-first workflows are useful when you need to:

  • Audit before redacting — inspect what will be masked before committing to an irreversible operation
  • Build a review UI — show a human the list of detected entities for approval before processing
  • Populate a custom replacement map — use the detected terms to construct a CUSTOM mode replacement dictionary
  • Log entity metadata separately — record what types of entities appeared without storing the original values
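
As a minimal sketch of the audit and review cases above, the helper below previews locally what a text would look like once a list of detected terms is masked, so a reviewer can approve it before any de-identification call is made. The function name `preview_masking` and the sample term list are hypothetical, and the plain string substitution only approximates the service's output; it is not the service's own masking logic.

```python
def preview_masking(text, terms, mask="****"):
    """Return text with each term replaced by mask (local preview only)."""
    # Replace longer terms first so a short term cannot split a longer one.
    for term in sorted(set(terms), key=len, reverse=True):
        if term:  # skip empty strings, which would match everywhere
            text = text.replace(term, mask)
    return text


sample = "Dr. Emily Chen reviewed the patient file for trial NX-4401."
detected = ["Emily Chen", "NX-4401"]  # hypothetical detection result

preview = preview_masking(sample, detected)
print(preview)
```

Because the preview runs locally, nothing irreversible happens until the reviewer approves the list and the terms are sent on to a de-identification endpoint.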

Two-step example

import requests

headers = {"X-API-Key": "cai_your_key_here"}
text = "Dr. Emily Chen reviewed the patient file for trial NX-4401."

# Step 1: detect
analysis = requests.post(
    "https://api.custodianai.com/api/v1/analyze/text/proprietary",
    headers=headers,
    json={"text": text, "domain": "Medical"},
).json()

print(analysis["sensitive_words"])
# e.g. ["Emily Chen", "NX-4401"]

# Step 2: de-identify using the detected terms as a custom map
replacements = {word: "****" for word in analysis["sensitive_words"]}

result = requests.post(
    "https://api.custodianai.com/api/v1/deidentify/text",
    headers=headers,
    json={
        "text": text,
        "compliance_mode": "CUSTOM",
        "replacements": replacements,
    },
).json()

print(result["output_text"])
# "Dr. **** reviewed the patient file for trial ****."
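
The example above maps every detected term to the same mask. Where downstream readers need to tell masked terms apart, one option is to give each term its own placeholder. The sketch below builds such a map; the helper name `build_replacements` and the `[TERM_n]` placeholder format are illustrative assumptions, and it assumes CUSTOM mode accepts arbitrary replacement strings, as the `"****"` example suggests.

```python
def build_replacements(terms, prefix="TERM"):
    """Map each distinct term to a numbered placeholder, in first-seen order."""
    replacements = {}
    for term in terms:
        # Skip empty strings and terms that already have a placeholder.
        if term and term not in replacements:
            replacements[term] = f"[{prefix}_{len(replacements) + 1}]"
    return replacements


# Hypothetical detection result containing a repeated term.
replacements = build_replacements(["Emily Chen", "NX-4401", "Emily Chen"])
print(replacements)
```

The resulting dictionary can be passed as the `replacements` value in the CUSTOM request in place of the uniform mask.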

One-step de-identification

For most cases, go directly to a de-identification endpoint. Privacy Data Transformation runs detection internally and applies the transformation in the same call:

# Reuses `headers` and `text` from the two-step example
result = requests.post(
    "https://api.custodianai.com/api/v1/deidentify/text",
    headers=headers,
    json={
        "text": text,
        "compliance_mode": "GDPR",
    },
).json()
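
In an integration, the one-step call is usually wrapped in a small function that sets a timeout and surfaces HTTP errors instead of parsing an error body as a result. The sketch below is one way to do that. The wrapper name `deidentify`, the `post` parameter (there only so the function can be exercised without network access), and the 10-second timeout are assumptions. It also assumes the GDPR response carries the `output_text` field shown in the CUSTOM example.

```python
API_URL = "https://api.custodianai.com/api/v1/deidentify/text"


def deidentify(text, api_key, compliance_mode="GDPR", post=None, timeout=10):
    """Send text for one-step de-identification and return the output text."""
    if post is None:
        import requests  # imported lazily so the function can be tested offline
        post = requests.post
    response = post(
        API_URL,
        headers={"X-API-Key": api_key},
        json={"text": text, "compliance_mode": compliance_mode},
        timeout=timeout,
    )
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return response.json()["output_text"]  # field name assumed from the CUSTOM example
```

A call such as `deidentify(text, "cai_your_key_here")` then returns only the transformed text, and a failed request raises an exception rather than producing a confusing `KeyError` later.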

Compliance Modes · Guardian Layer