Safe AI Workbench Developer Docs
AI-powered workspace with PHI protection
HIPAA-compliant API

Policy Management Guide

Configure custom policies to control how sensitive content is handled before AI processing

What are Policies?

Policies are rules that automatically enforce safety controls on AI requests. Each policy can:

Detect sensitive patterns - Use regular expressions or keyword matching to find sensitive content
Apply actions - Warn, block, redact, or allow content based on severity
Customize by group - Different policies for different teams or departments
Audit compliance - All policy triggers are logged for review

Policy Actions

Warn

Allow the request but flag the policy violation in the response. Use for monitoring without blocking.

Block

Reject the request immediately. AI processing does not occur. Use for critical violations.

Redact

Replace sensitive content with [REDACTED] before AI processing. Balances safety and functionality.
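
As an illustration of the redact action, here is a minimal Python sketch that replaces every match of a policy pattern with [REDACTED]. The function name redact and the is_regex parameter are hypothetical, and the sketch assumes the whole match is replaced. The service's actual redaction logic may differ, for example in how it treats capture groups.

```python
import re

REDACTION_TOKEN = "[REDACTED]"

def redact(text: str, pattern: str, is_regex: bool = True) -> str:
    """Replace every match of pattern in text with the redaction token."""
    # Assumption: keyword (non-regex) patterns are matched literally,
    # so they are escaped before being handed to the regex engine.
    regex = pattern if is_regex else re.escape(pattern)
    return re.sub(regex, REDACTION_TOKEN, text)

sample = "Patient SSN is 123-45-6789, please summarize."
print(redact(sample, r"\b\d{3}-\d{2}-\d{4}\b"))
# prints: Patient SSN is [REDACTED], please summarize.
```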

Allow

Explicitly permit content even if it matches other patterns. Use for exceptions and overrides.

Creating a Policy

Create policies in the Admin Dashboard or through the API. An API request looks like this:

POST /api/admin/policies
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "name": "Block Social Security Numbers",
  "description": "Prevent SSNs from being sent to AI models",
  "pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b",
  "isRegex": true,
  "action": "block",
  "enabled": true,
  "groupId": null
}

Setting groupId to null applies the policy to all groups.

💡 Tip: Start with warn policies to monitor patterns before blocking, to avoid false positives.
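
The POST request in this section can be assembled from a script. The following Python sketch uses only the standard library. BASE_URL is a placeholder, not a real host, and build_create_policy_request is a hypothetical helper name. The sketch builds the request without sending it.

```python
import json
import urllib.request

# Assumption: placeholder host; replace with your deployment's base URL.
BASE_URL = "https://workbench.example.com"

def build_create_policy_request(api_key: str, policy: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /api/admin/policies request."""
    body = json.dumps(policy).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/admin/policies",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

policy = {
    "name": "Block Social Security Numbers",
    "description": "Prevent SSNs from being sent to AI models",
    "pattern": r"\b\d{3}-\d{2}-\d{4}\b",
    "isRegex": True,
    "action": "block",
    "enabled": True,
    "groupId": None,  # serialized as JSON null
}

request = build_create_policy_request("YOUR_API_KEY", policy)
# To send it: urllib.request.urlopen(request)
```

Writing the pattern as a Python raw string and letting json.dumps serialize it produces the doubled backslashes that JSON requires, which avoids escaping the regex by hand.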

Example Policies

Block: Credit Card Numbers

{
  "name": "Block Credit Cards",
  "pattern": "\\b(?:\\d{4}[- ]?){3}\\d{4}\\b",
  "isRegex": true,
  "action": "block"
}

Redact: Patient Names

{
  "name": "Redact Patient Names",
  "pattern": "patient\\s+(?:name|id):\\s*([A-Za-z ]+)",
  "isRegex": true,
  "action": "redact"
}

Warn: Profanity

{
  "name": "Warn on Profanity",
  "pattern": "\\b(damn|hell|crap)\\b",
  "isRegex": true,
  "action": "warn"
}

Allow: De-identified Data

{
  "name": "Allow De-identified IDs",
  "pattern": "PATIENT-[0-9]{6}",
  "isRegex": true,
  "action": "allow"
}
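
Before enabling patterns like the ones above, it can help to try them against sample text locally. The sketch below does this with Python's re module. It assumes the service's regex engine behaves like Python's for these patterns, including case-sensitive matching, which this guide does not confirm. matching_policies is a hypothetical helper.

```python
import re

# The four example patterns from this section, written as Python raw strings.
EXAMPLE_PATTERNS = {
    "Block Credit Cards": r"\b(?:\d{4}[- ]?){3}\d{4}\b",
    "Redact Patient Names": r"patient\s+(?:name|id):\s*([A-Za-z ]+)",
    "Warn on Profanity": r"\b(damn|hell|crap)\b",
    "Allow De-identified IDs": r"PATIENT-[0-9]{6}",
}

def matching_policies(text: str) -> list:
    """Return the names of the example policies whose pattern matches text."""
    return [name for name, pattern in EXAMPLE_PATTERNS.items()
            if re.search(pattern, text)]

print(matching_policies("Card 4111 1111 1111 1111 on file"))
# prints: ['Block Credit Cards']
print(matching_policies("Record PATIENT-004217 reviewed"))
# prints: ['Allow De-identified IDs']
print(matching_policies("hello world"))
# prints: []
```

The last call shows why the word boundaries in the profanity pattern matter: "hello" contains "hell" but does not trigger the policy.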

Policy Evaluation Order

Policies are evaluated in this order:

Allow Policies First

If any allow policy matches, skip remaining checks

Block Policies Next

If any block policy matches, reject immediately

Redact Policies

Apply all matching redaction rules to content

Warn Policies Last

Flag violations in response but allow processing
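
The evaluation order above can be sketched in Python as follows. This is an illustration under stated assumptions, not the service's implementation: Policy, Result, and evaluate are hypothetical names, every pattern is treated as a regular expression, and warn policies are matched against the original (unredacted) text.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Policy:
    name: str
    pattern: str  # treated as a regular expression in this sketch
    action: str   # "allow", "block", "redact", or "warn"

@dataclass
class Result:
    blocked: bool
    content: str
    violations: list = field(default_factory=list)

def evaluate(text: str, policies: list) -> Result:
    def matching(action: str) -> list:
        return [p for p in policies
                if p.action == action and re.search(p.pattern, text)]

    # 1. Allow: any match skips all remaining checks.
    if matching("allow"):
        return Result(blocked=False, content=text)

    # 2. Block: the first match rejects the request.
    blockers = matching("block")
    if blockers:
        return Result(blocked=True, content="", violations=[blockers[0].name])

    # 3. Redact: apply every matching redaction rule.
    content = text
    violations = []
    for policy in matching("redact"):
        content = re.sub(policy.pattern, "[REDACTED]", content)
        violations.append(policy.name)

    # 4. Warn: flag the match but let processing continue.
    violations += [p.name for p in matching("warn")]
    return Result(blocked=False, content=content, violations=violations)

policies = [
    Policy("Allow De-identified IDs", r"PATIENT-[0-9]{6}", "allow"),
    Policy("Block Social Security Numbers", r"\b\d{3}-\d{2}-\d{4}\b", "block"),
    Policy("Redact Patient Names", r"patient\s+(?:name|id):\s*([A-Za-z ]+)", "redact"),
    Policy("Warn on Profanity", r"\b(damn|hell|crap)\b", "warn"),
]

print(evaluate("SSN 123-45-6789", policies).blocked)
# prints: True
print(evaluate("PATIENT-123456 has SSN 123-45-6789", policies).blocked)
# prints: False
```

The second call illustrates a consequence of the stated order: because an allow match skips the remaining checks, text that matches an allow policy passes even if it also contains content a block policy would reject. Keeping allow patterns narrow limits this effect.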

Response with Policy Violations

When policies are triggered, the response includes violation details. If the request was blocked, completion is null:

{
  "completion": null,
  "policyViolations": [
    {
      "policyId": "pol_abc123",
      "policyName": "Block Social Security Numbers",
      "action": "block",
      "pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b",
      "matches": ["123-45-6789"],
      "message": "Request blocked due to SSN detection"
    }
  ],
  "conversationId": "conv_xyz789"
}
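
A client can inspect this response before using the completion. The Python sketch below parses the sample response and reports whether the request was blocked. summarize_response is a hypothetical helper, and the field names are taken from the sample above. Treating a missing policyViolations field as an empty list is an assumption.

```python
import json

def summarize_response(raw: str) -> dict:
    """Extract the fields a client typically needs from a response body."""
    data = json.loads(raw)
    # Assumption: a missing or null policyViolations means no violations.
    violations = data.get("policyViolations") or []
    return {
        "blocked": any(v.get("action") == "block" for v in violations),
        "completion": data.get("completion"),
        "messages": [v.get("message", "") for v in violations],
        "conversationId": data.get("conversationId"),
    }

# The sample response from this section, as a raw string so the JSON
# escapes in "pattern" are passed through unchanged.
raw = r'''
{
  "completion": null,
  "policyViolations": [
    {
      "policyId": "pol_abc123",
      "policyName": "Block Social Security Numbers",
      "action": "block",
      "pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b",
      "matches": ["123-45-6789"],
      "message": "Request blocked due to SSN detection"
    }
  ],
  "conversationId": "conv_xyz789"
}
'''

summary = summarize_response(raw)
print(summary["blocked"], summary["messages"])
# prints: True ['Request blocked due to SSN detection']
```

The summary deliberately leaves out the matches field, since it contains the sensitive text that triggered the policy and is best kept out of client-side logs.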

Best Practices

Start with warn policies

Monitor patterns before blocking to avoid false positives

Use specific detection rules

Overly broad rules cause false positives and user frustration

Document policy intent

Clear descriptions help administrators understand and maintain policies

Review audit logs regularly

Analytics show which policies are triggering and how often

Next Steps

Policies API Reference →

Complete API documentation for policy CRUD operations

PHI Detection Guide →

Learn about policy-driven PHI detection