PII redaction
Nebula passes work-defining document text (contracts, SOWs, scopes,
briefs, specs, MSAs, change orders, RFPs, etc.) to an LLM for
extraction. To keep PII out of the LLM context, every prompt goes
through a redaction layer before it leaves the server.
What's redacted
- Email addresses
- Australian phone numbers
- ABN, ACN
- Medicare numbers
- TFN
- Driver's licence numbers
The patterns are in src/lib/ai-redaction.ts. Each match is
replaced with a stable placeholder; the unredact step restores the
original after the LLM returns.
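
The replace-and-restore round trip described above can be sketched as follows. This is a minimal illustration, not the contents of src/lib/ai-redaction.ts: the function names redact and unredact, the [[LABEL_n]] placeholder format, and the two simplified patterns (email, Australian mobile numbers only) are all assumptions made for the example.

```typescript
// Hypothetical sketch; names, placeholder format, and patterns are illustrative only.
type RedactionResult = { text: string; mapping: Map<string, string> };

// Simplified patterns. Real ones would need to cover far more formats.
const PATTERNS: Array<{ label: string; regex: RegExp }> = [
  { label: "EMAIL", regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "PHONE", regex: /(?:\+61\s?|0)4\d{2}\s?\d{3}\s?\d{3}/g }, // AU mobiles only
];

function redact(input: string): RedactionResult {
  const mapping = new Map<string, string>();  // placeholder -> original value
  const seen = new Map<string, string>();     // original value -> placeholder
  const counters = new Map<string, number>(); // per-label running count
  let text = input;
  for (const { label, regex } of PATTERNS) {
    text = text.replace(regex, (match) => {
      // "Stable" here means the same value always maps to the same placeholder.
      const existing = seen.get(match);
      if (existing !== undefined) return existing;
      const n = (counters.get(label) ?? 0) + 1;
      counters.set(label, n);
      const placeholder = `[[${label}_${n}]]`;
      seen.set(match, placeholder);
      mapping.set(placeholder, match);
      return placeholder;
    });
  }
  return { text, mapping };
}

function unredact(text: string, mapping: Map<string, string>): string {
  let out = text;
  // split/join replaces every occurrence without regex or "$" substitution surprises.
  mapping.forEach((original, placeholder) => {
    out = out.split(placeholder).join(original);
  });
  return out;
}
```

In this sketch the mapping stays on the server alongside the request, so only the placeholder text reaches the LLM and the response can be restored afterwards with unredact(response, mapping).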
What's not redacted
Names of organisations, project names, dates, and monetary amounts.
These are needed for the engines to do their job and are not
regulated PII in the documents Nebula handles.
Why this matters
Anthropic's terms forbid sending PII without specific approval, and
defence-classified projects mandate redaction. The redaction layer
is the single boundary between document text and the LLM; any code
path that bypasses it must fail the CI grep check.
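
One way such a check could work is to flag any file that imports the LLM SDK directly instead of going through the redaction layer. The sketch below is an assumption about the shape of that check, not the actual CI rule: the package name @anthropic-ai/sdk, the allowed path src/lib/ai-client.ts, and the function findBypasses are all hypothetical choices for illustration.

```typescript
// Hypothetical sketch of a bypass check; package name and allowed path are assumptions.
const SDK_IMPORT =
  /from\s+["']@anthropic-ai\/sdk["']|require\(\s*["']@anthropic-ai\/sdk["']\s*\)/;

// Assumed: the only file permitted to import the SDK, because it redacts first.
const ALLOWED = new Set<string>(["src/lib/ai-client.ts"]);

// Takes a map of file path -> file contents and returns the offending paths, sorted.
function findBypasses(files: Record<string, string>): string[] {
  return Object.keys(files)
    .filter((path) => !ALLOWED.has(path) && SDK_IMPORT.test(files[path] ?? ""))
    .sort();
}
```

A CI step built on this idea would read the source tree, call findBypasses, and exit non-zero when the returned list is not empty.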
Cross links
- /docs/security/audit: how access is logged
- /docs/compliance/privacy: right-to-erasure flow