Getting Started

Expunct ships two API pillars. Pick the one you need first — you can come back for the other.

Pillar	Best for	Status	Shortest first-success path
Redaction	Sanitizing text, files, prompts, or LLM I/O	GA	Python or Node SDK, or `curl`
Document Intelligence	Parsing, extracting, or safe-parsing PDF/DOCX for AI	Beta — gated per tenant	Raw HTTP (`curl` / `httpx` / `fetch`)

Document Intelligence is in beta, and its endpoints return 403 until the feature flag is enabled for your tenant. If you signed up today, the redaction path works immediately. Document Intelligence requires beta enablement: approved Professional and Business tenants are the primary rollout tier, and Starter tenants can get access by request.

1. Get an API key

Create an API key from the dashboard or via the API:


curl -X POST https://api.expunct.ai/api/v1/api-keys \
  -H "Content-Type: application/json" \
  -d '{"name": "my-first-key"}'

API keys use the format pk_live_... for production or pk_test_... for testing.
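A small guard can catch a mistyped or wrong-environment key before any request is sent. The sketch below relies only on the two prefixes described above; the function names and the `EXPUNCT_API_KEY` environment variable are hypothetical choices for illustration, not part of the Expunct SDK.

```python
import os


def key_environment(api_key: str) -> str:
    """Classify a key by its documented prefix."""
    if api_key.startswith("pk_live_"):
        return "production"
    if api_key.startswith("pk_test_"):
        return "test"
    raise ValueError("API key must start with pk_live_ or pk_test_")


def load_api_key(env_var: str = "EXPUNCT_API_KEY") -> str:
    """Read the key from an environment variable (hypothetical name)."""
    api_key = os.environ.get(env_var, "")
    key_environment(api_key)  # raises ValueError if the prefix is not recognized
    return api_key
```

Reading the key from the environment keeps it out of source control; `key_environment` can also be used to refuse `pk_live_` keys in a test suite.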

Path A — Redact text (works on every plan)

This is the fastest first success and works for every signed-up tenant.

Install an SDK

Python


pip install expunct

Redact text

Python


from expunct import Expunct
 
client = Expunct(api_key="pk_live_...")
result = client.redact.text(
    text="John Smith's email is john@example.com and SSN is 123-45-6789",
)
print(result.redacted_text)
# "[PERSON]'s email is [EMAIL_ADDRESS] and SSN is [US_SSN]"

Node.js


import { Expunct } from '@expunct/sdk';
 
const client = new Expunct({ apiKey: 'pk_live_...' });
const result = await client.redact.text({
  text: "John Smith's email is john@example.com and SSN is 123-45-6789",
});
console.log(result.redactedText);
// "[PERSON]'s email is [EMAIL_ADDRESS] and SSN is [US_SSN]"

cURL


curl -X POST https://api.expunct.ai/api/v1/redact \
  -H "X-API-Key: pk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"text": "John Smith'\''s email is john@example.com"}'

The response includes the redacted text and a list of findings — see Redaction for the full schema.
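If you would rather not install an SDK, the cURL call above translates directly to the Python standard library. This is a minimal sketch: `build_redact_request` and `redact` are hypothetical helper names, and the response is returned as a plain dict without assuming any field names, since the schema is documented on the Redaction page.

```python
import json
import urllib.request

REDACT_URL = "https://api.expunct.ai/api/v1/redact"


def build_redact_request(api_key: str, text: str) -> urllib.request.Request:
    """Build the same POST request as the cURL example, without sending it."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        REDACT_URL,
        data=body,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )


def redact(api_key: str, text: str) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_redact_request(api_key, text)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Replace the placeholder key before running; this performs a real network call.
    print(redact("pk_live_...", "John Smith's email is john@example.com"))
```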

Path B — Document Intelligence (beta)

The reliable first-success path for Document Intelligence today is raw HTTP. Do not assume that the published Python SDK, Node SDK, CLI, or MCP packages expose Document Intelligence operations yet; use curl, httpx, or fetch until package support is explicitly published and documented. Status is tracked on the Document Intelligence page.

Document Intelligence has three operations on PDF and DOCX:

Operation	Endpoint	What it returns	Use when
Parse	`POST /api/v1/parse`	Canonical structure + markdown + chunks	You need RAG-ready text and structure
Extract	`POST /api/v1/extract`	JSON matching your schema (or a built-in `template_id`)	You need specific fields (invoice totals, dates, names)
Safe-Parse	`POST /api/v1/workflows/safe-parse`	Sanitized canonical + markdown + chunks (no PII)	You need parse output that is safe to embed, store, or send to a third-party LLM

Safe-parse (workflow kind safe_parse) runs parse and sanitize as a single workflow; it is not a separate parser. Use it when the document is sensitive and you want only sanitized artifacts persisted.
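When calling these operations over raw HTTP, it can help to keep the three endpoints from the table in one place. The sketch below does only that; `ENDPOINTS` and `endpoint_for` are hypothetical names, and the operation keys are an illustrative convention rather than API values.

```python
BASE_URL = "https://api.expunct.ai"

# Paths copied from the operations table above.
ENDPOINTS = {
    "parse": "/api/v1/parse",
    "extract": "/api/v1/extract",
    "safe_parse": "/api/v1/workflows/safe-parse",
}


def endpoint_for(operation: str) -> str:
    """Return the full URL for a Document Intelligence operation."""
    try:
        return BASE_URL + ENDPOINTS[operation]
    except KeyError:
        raise ValueError(f"unknown operation: {operation!r}") from None
```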

First success — submit a safe-parse job

cURL


# 1. Submit
curl -X POST https://api.expunct.ai/api/v1/workflows/safe-parse \
  -H "X-API-Key: pk_live_..." \
  -F "file=@document.pdf" \
  -F "language=en"
 
# Response: { "id": "7a8b...", "status": "queued", "workflow_kind": "safe_parse", ... }
 
# 2. Poll
curl https://api.expunct.ai/api/v1/documents/jobs/7a8b... \
  -H "X-API-Key: pk_live_..."
 
# 3. Once status == "completed", read an artifact
curl https://api.expunct.ai/api/v1/documents/<artifact_id>/content \
  -H "X-API-Key: pk_live_..."

Python (httpx)


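import time

import httpx

api_key = "pk_live_..."
headers = {"X-API-Key": api_key}

# 1. Submit the document as multipart form data.
with open("document.pdf", "rb") as f:
    r = httpx.post(
        "https://api.expunct.ai/api/v1/workflows/safe-parse",
        headers=headers,
        files={"file": ("document.pdf", f, "application/pdf")},
        data={"language": "en"},
    )
r.raise_for_status()  # a 403 here means the beta is not enabled for your tenant
job = r.json()

# 2. Poll until the job reaches a terminal status.
while job["status"] not in ("completed", "failed"):
    time.sleep(2)
    r = httpx.get(
        f"https://api.expunct.ai/api/v1/documents/jobs/{job['id']}",
        headers=headers,
    )
    r.raise_for_status()
    job = r.json()

# 3. List the artifacts produced by the job (none are assumed for a failed job).
for art in job.get("artifacts", []):
    print(art["artifact_kind"], art["id"])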

Node.js (fetch)


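import fs from 'node:fs';

const apiKey = 'pk_live_...';
const headers = { 'X-API-Key': apiKey };

// fetch does not reject on HTTP errors, so fail fast on non-2xx responses
// (for example, a 403 when the beta is not enabled) instead of polling forever.
async function toJson(res) {
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}

// 1. Submit the document as multipart form data.
const form = new FormData();
form.append('file', new Blob([fs.readFileSync('document.pdf')]), 'document.pdf');
form.append('language', 'en');

let job = await fetch(
  'https://api.expunct.ai/api/v1/workflows/safe-parse',
  { method: 'POST', headers, body: form },
).then(toJson);

// 2. Poll until the job reaches a terminal status.
while (job.status !== 'completed' && job.status !== 'failed') {
  await new Promise((r) => setTimeout(r, 2000));
  job = await fetch(
    `https://api.expunct.ai/api/v1/documents/jobs/${job.id}`,
    { headers },
  ).then(toJson);
}

// 3. List the artifacts produced by the job (none are assumed for a failed job).
for (const art of job.artifacts ?? []) console.log(art.artifact_kind, art.id);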
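The polling loops above run until the job reaches a terminal status, with no upper bound. For unattended scripts you may want a deadline. The following is a minimal Python sketch, assuming only the "completed" and "failed" terminal statuses shown above; `poll_job` is a hypothetical helper, and the 300-second default timeout is an arbitrary illustration, not a documented limit.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def poll_job(
    fetch_job: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_job until the job is terminal or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job still {job['status']!r} after {timeout} seconds")
        sleep(interval)
```

`fetch_job` is any zero-argument callable that returns the job JSON, for example a lambda wrapping the `httpx.get(...).json()` call from the Python example. Passing `sleep` and `clock` as parameters keeps the helper easy to test without real delays.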

If the first call returns 403, your tenant does not yet have document_safe_parse_workflow enabled. Contact support to enable Document Intelligence beta for your account.
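In a script, it is worth turning that 403 into an explicit error rather than letting a later step fail on a missing field. A minimal sketch, assuming only the 403 behavior described above; `BetaNotEnabledError` and `check_submit_status` are hypothetical names.

```python
class BetaNotEnabledError(RuntimeError):
    """Raised when the tenant does not have Document Intelligence enabled."""


def check_submit_status(status_code: int) -> None:
    """Raise a descriptive error for an unsuccessful submit response."""
    if status_code == 403:
        raise BetaNotEnabledError(
            "Document Intelligence beta is not enabled for this tenant; "
            "contact support to request access."
        )
    if status_code >= 400:
        raise RuntimeError(f"request failed with HTTP {status_code}")
```

Call it with the submit response's status code (for example `r.status_code` in httpx) before reading the JSON body.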

See the full reference: Parse, Extract, Safe Parse.

Next steps

All users
- Workflows: file, batch, and policy-based redaction recipes
- Entity Types: what the redaction engine detects
- API Reference: every endpoint

Redaction
- Python SDK and Node.js SDK: published and ready
- LangChain integration: drop-in PII redaction middleware
- MCP server: redaction tools for Claude Code, Claude Desktop, and other MCP clients

Document Intelligence (beta)
- Document Intelligence overview
- Safe-parse for RAG ingestion