Expunct
A platform for detecting and redacting personally identifiable information (PII) in text, documents, images, video, and audio.
Expunct provides a single API for scanning and redacting sensitive data across multiple formats and media types. It detects more than 27 entity types, organized into three categories:
- PII — Personally Identifiable Information (names, emails, SSNs, addresses, and more)
- PCI — Payment Card Industry data (credit cards, bank accounts, IBAN codes)
- PHI — Protected Health Information (medical licenses, national/religious/political groups)
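To make the detect-then-redact idea concrete, here is a minimal Python sketch that does not use Expunct at all. It stands in for a real detector with two toy regular expressions and tags each finding with one of the three categories above. The entity names (`EMAIL`, `SSN`, and so on), the `Finding` class, and the `scan` and `redact` functions are assumptions made for illustration; they are not Expunct's identifiers or API.

```python
import re
from dataclasses import dataclass

# Hypothetical entity-to-category mapping; names are illustrative only.
ENTITY_CATEGORY = {
    "EMAIL": "PII",
    "SSN": "PII",
    "CREDIT_CARD": "PCI",
    "IBAN": "PCI",
    "MEDICAL_LICENSE": "PHI",
}

# Toy patterns for two entity types. A production detector would be
# far more robust than these regular expressions.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass
class Finding:
    entity: str    # e.g. "EMAIL"
    category: str  # "PII", "PCI", or "PHI"
    start: int     # offset of the first matched character
    end: int       # offset one past the last matched character


def scan(text: str) -> list[Finding]:
    """Return every pattern match in the text, ordered by position."""
    findings = []
    for entity, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append(
                Finding(entity, ENTITY_CATEGORY[entity], match.start(), match.end())
            )
    return sorted(findings, key=lambda f: f.start)


def redact(text: str) -> str:
    """Replace each finding with a placeholder such as [EMAIL]."""
    redacted = text
    # Work backwards so earlier offsets stay valid after each replacement.
    for finding in reversed(scan(text)):
        redacted = redacted[:finding.start] + f"[{finding.entity}]" + redacted[finding.end:]
    return redacted


print(redact("Mail jane@example.com, SSN 123-45-6789."))
# prints: Mail [EMAIL], SSN [SSN].
```

Separating `scan` from `redact` mirrors the scan and redact operations described above: a caller can inspect findings and their categories without altering the text.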
Supported formats
| Category | Formats |
|---|---|
| Text | Plain text, JSON |
| Documents | PDF, DOCX |
| Images | PNG, JPG |
| Video | MP4 |
| Audio | WAV, MP3 |
| Cloud URIs | s3://, gs://, https:// |
Multi-language support
Expunct supports detection and redaction in multiple languages, including English (en) and Spanish (es), with more languages planned.