POST /api/v1/deidentify/text/proprietary/outputs/pdf
De-identifies a PDF file word by word using Guardian Layer, preserving page layout. Sensitive words are replaced using PDF redaction annotations. Returns a de-identified PDF.
Note
This endpoint processes PDFs with a text layer. Scanned PDFs without selectable text will not have content de-identified. For image-based documents, use the image endpoint instead.
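Because scanned PDFs are not de-identified, it can help to check for a text layer before uploading. The sketch below is one possible pre-flight check, assuming the third-party pypdf package is installed. The helper names and the 20-character threshold are illustrative choices, not part of this API.

```python
def has_text_layer(page_texts, min_chars=20):
    """Return True if the extracted page texts hold enough non-whitespace characters.

    min_chars is an arbitrary illustrative threshold, not an API rule.
    """
    total = sum(len("".join(text.split())) for text in page_texts if text)
    return total >= min_chars


def pdf_has_text_layer(path, min_chars=20):
    """Extract text from every page with pypdf and apply the check above."""
    from pypdf import PdfReader  # third-party dependency: pip install pypdf

    reader = PdfReader(path)
    texts = [page.extract_text() or "" for page in reader.pages]
    return has_text_layer(texts, min_chars)
```

If `pdf_has_text_layer("contract.pdf")` returns False, the file is probably image-based and a better fit for the image endpoint.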
Request
POST /api/v1/deidentify/text/proprietary/outputs/pdf
X-API-Key: cai_your_key_here
Content-Type: multipart/form-data
Form fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| file | file | Yes | — | A .pdf file |
| domain | string | No | General | General, Medical, Finance, or Custom |
| masking_type | string | No | transform | redact fills changed words with a black rectangle; transform replaces them with alternative text |
| pii_entities | string | No | all | Comma-separated entity types. Leave empty for all |
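The optional fields in the table can be validated on the client before the request is sent. The following is a minimal sketch: `build_form_data` is a hypothetical helper, not part of any official client, and it assumes that omitting pii_entities has the same effect as leaving it empty.

```python
ALLOWED_DOMAINS = {"General", "Medical", "Finance", "Custom"}
ALLOWED_MASKING_TYPES = {"redact", "transform"}


def build_form_data(domain="General", masking_type="transform", pii_entities=None):
    """Build the non-file form fields, rejecting values the table does not list."""
    if domain not in ALLOWED_DOMAINS:
        raise ValueError(f"unsupported domain: {domain!r}")
    if masking_type not in ALLOWED_MASKING_TYPES:
        raise ValueError(f"unsupported masking_type: {masking_type!r}")

    data = {"domain": domain, "masking_type": masking_type}

    # pii_entities travels as one comma-separated string; omit it to get all types.
    joined = ",".join(e.strip() for e in (pii_entities or []) if e.strip())
    if joined:
        data["pii_entities"] = joined
    return data
```

The returned dictionary can be passed as the `data` argument of `requests.post` alongside the `files` argument that carries the PDF.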
Response
Returns application/pdf. The filename is prefixed with deid_.
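This page does not say how the deid_ filename is delivered. Assuming it arrives in a Content-Disposition header (an assumption, not confirmed here), a client could pick the output name as sketched below, falling back to adding the prefix locally. `output_filename` is a hypothetical helper.

```python
import os
from email.message import Message


def output_filename(headers, input_path):
    """Choose a name for the returned PDF.

    Assumption: the server sends the deid_-prefixed name in a
    Content-Disposition header. If it is absent, prefix the input name locally.
    """
    disposition = headers.get("Content-Disposition")
    if disposition:
        msg = Message()
        msg["Content-Disposition"] = disposition
        name = msg.get_filename()
        if name:
            return os.path.basename(name)  # drop any path components
    return "deid_" + os.path.basename(input_path)
```

With the requests library, `response.headers` can be passed directly as `headers`.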
Example
import requests

# Upload the PDF together with the optional form fields.
with open("contract.pdf", "rb") as f:
    response = requests.post(
        "https://api.custodianai.com/api/v1/deidentify/text/proprietary/outputs/pdf",
        headers={"X-API-Key": "cai_your_key_here"},
        files={"file": ("contract.pdf", f, "application/pdf")},
        data={
            "domain": "Finance",
            "masking_type": "redact",
            "pii_entities": "PERSON,ID_NUMBER",
        },
    )

# Stop on 4xx/5xx so an error body is never saved as a PDF.
response.raise_for_status()

# Save the de-identified PDF returned in the response body.
with open("deid_contract.pdf", "wb") as out:
    out.write(response.content)
Error responses
| Status | Description |
|---|---|
| 400 | File is not a .pdf or is malformed |
| 401 | Missing or invalid API key |
| 403 | Key expired or character limit reached |
| 500 | PDF processing library unavailable |
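The status codes above can be turned into readable client-side errors. This is a minimal sketch: `DeidentifyError` and `check_status` are hypothetical names, and the messages are copied from the table rather than read from the response body, whose format is not documented here.

```python
ERROR_HINTS = {
    400: "File is not a .pdf or is malformed",
    401: "Missing or invalid API key",
    403: "Key expired or character limit reached",
    500: "PDF processing library unavailable",
}


class DeidentifyError(Exception):
    """Raised when the endpoint returns a non-2xx status."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code


def check_status(status_code):
    """Return None for 2xx; otherwise raise with the documented description."""
    if 200 <= status_code < 300:
        return
    hint = ERROR_HINTS.get(status_code, "Unexpected error")
    raise DeidentifyError(status_code, hint)
```

Calling `check_status(response.status_code)` before writing `response.content` to disk keeps an error body from being saved as a PDF.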