Safe AI Workbench Developer Docs

Workflow Builder

Visual workflow designer for creating custom PDF processing pipelines

Overview

The Workflow Builder provides a visual drag-and-drop interface for creating custom document processing workflows. Build Tier 1 (de-identified) or Tier 2 (BAA-covered) workflows by connecting specialized PDF processing nodes.

Key Features:

  • Drag-and-drop node placement on canvas
  • Visual connection of workflow steps
  • Real-time configuration of node parameters
  • Workflow validation before deployment
  • Execution history and debugging

Accessing the Workflow Builder

1. Navigate to Workflows

Go to System Admin → Workflows from the main navigation.

2. View All Workflows

The Workflows page displays all existing workflows with their status, category, and last updated date.

3. Create New Workflow

Click the + New Workflow button (top-right) to open the visual workflow builder.

4. Edit Existing Workflow

Click the Edit button next to any workflow to modify it in the builder.

PDF Workflow Nodes

The PDF Workflow section in the task library contains specialized nodes for document processing:

📄

PDF Input

Accept PDF file upload and extract text content for downstream processing.

Inputs: PDF file (user upload)

Outputs: Extracted text, page count, file metadata

Configuration: OCR options, page range selection

🛡️

PHI Detection

Scan document text for Protected Health Information using AI-powered entity recognition.

Inputs: Document text (from PDF Input or other source)

Outputs: PHI entities (type, value, confidence), entity count, entity types

Configuration: Confidence threshold, entity types to detect
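
To illustrate how the two configuration options might interact, the sketch below filters detected entities by a confidence threshold and an optional set of entity types, then builds the count and type summary. The `PhiEntity` class, the `filter_entities` function, and the entity type names are hypothetical and not part of the Workbench API; only the type/value/confidence fields come from the node description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhiEntity:
    # Fields mirror the node's documented output: type, value, confidence.
    type: str
    value: str
    confidence: float


def filter_entities(entities, confidence_threshold=0.8, entity_types=None):
    """Keep entities at or above the threshold and, if given, of the selected types."""
    kept = []
    for entity in entities:
        if entity.confidence < confidence_threshold:
            continue
        if entity_types is not None and entity.type not in entity_types:
            continue
        kept.append(entity)
    return kept


# Hypothetical detection results, for illustration only.
detected = [
    PhiEntity("NAME", "Jane Doe", 0.97),
    PhiEntity("DATE", "2024-03-01", 0.65),
    PhiEntity("MRN", "12345678", 0.91),
]
result = filter_entities(detected, confidence_threshold=0.8, entity_types={"NAME", "MRN"})
summary = {
    "entityCount": len(result),
    "entityTypes": sorted({e.type for e in result}),
}
```

Raising the threshold trades recall for precision: fewer false positives reach Smart Redaction, but low-confidence PHI may pass through undetected, which is one reason to keep Output Validation downstream.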

🔒

Smart Redaction

Redact PHI with tokenization for reversibility. Preserves clinical terms for accurate AI processing.

Inputs: Document text, PHI entities, workflow execution ID

Outputs: Redacted text, token map ID, redaction count

Configuration: Strategy (full/selective/preserve_clinical), tokenization, date granularity
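
The idea of reversible redaction can be sketched as follows: each PHI value is replaced with a placeholder token, and a token map records how to restore it. This is a simplified illustration, not the node's actual implementation. The function names and the `[TYPE_n]` token format are assumptions, and the sketch returns the token map itself where the real node returns a token map ID.

```python
def redact_with_tokens(text, entities):
    """Replace each PHI value with a token and record the mapping.

    entities: list of (entity_type, value) tuples, e.g. from PHI Detection.
    Returns (redacted_text, token_map, redaction_count).
    """
    token_map = {}
    counters = {}
    redacted = text
    redaction_count = 0
    # De-duplicate while keeping order, then handle longer values first so a
    # short value contained in a longer one does not split it.
    unique = list(dict.fromkeys(entities))
    for entity_type, value in sorted(unique, key=lambda e: -len(e[1])):
        occurrences = redacted.count(value)
        if occurrences == 0:
            continue
        counters[entity_type] = counters.get(entity_type, 0) + 1
        token = f"[{entity_type}_{counters[entity_type]}]"
        token_map[token] = value
        redacted = redacted.replace(value, token)
        redaction_count += occurrences
    return redacted, token_map, redaction_count


def restore(redacted, token_map):
    """Reverse the redaction using the token map."""
    for token, value in token_map.items():
        redacted = redacted.replace(token, value)
    return redacted


text = "Jane Doe (MRN 12345678) has hypertension. Jane Doe was seen today."
entities = [("NAME", "Jane Doe"), ("MRN", "12345678")]
redacted, token_map, count = redact_with_tokens(text, entities)
```

In this example the clinical term "hypertension" is untouched because it is not in the entity list, which is the behavior the preserve_clinical strategy name suggests.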

Output Validation

Scan output for residual PHI and validate data quality before transmission.

Inputs: Output data, allowed PHI fields, schema definition

Outputs: Validation passed/failed, residual PHI found, risk level, quality score

Configuration: PHI scan toggle, allowed fields, blocking behavior, JSON schema
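
A minimal sketch of the residual PHI scan is shown below, assuming the output is a flat dictionary of fields. The regular expressions stand in for the real detector and are purely illustrative, and `validate_output` and its parameters are hypothetical names. The sketch covers the allowed-fields and blocking options only, not schema validation or quality scoring.

```python
import re

# Illustrative patterns only; a real scan would reuse the PHI detection model.
PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def validate_output(output, allowed_phi_fields=(), block_on_phi=True):
    """Scan every field not explicitly allowed to carry PHI."""
    findings = []
    for field, value in output.items():
        if field in allowed_phi_fields:
            continue  # PHI is expected here (Tier 2 designated field)
        for label, pattern in PHI_PATTERNS.items():
            if pattern.search(str(value)):
                findings.append({"field": field, "type": label})
    # With blocking enabled, any finding fails validation.
    passed = not (findings and block_on_phi)
    return {"passed": passed, "residualPHI": findings}


tier1_result = validate_output({"summary": "Call 555-123-4567", "note": "ok"})
tier2_result = validate_output(
    {"patientPhone": "555-123-4567", "note": "ok"},
    allowed_phi_fields=["patientPhone"],
)
```

An empty allowed-fields list, as in a Tier 1 workflow, means any match anywhere fails validation when blocking is enabled.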

📤

API Post

Securely transmit data to external API endpoints with authentication, TLS enforcement, and retry logic.

Inputs: Endpoint config ID, data payload, workflow execution ID, dry run flag

Outputs: Success/failure, HTTP status, response body, transmission ID, latency

Configuration: Endpoint selection, dry run mode
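
The combination of TLS enforcement, dry run, and retry logic might look roughly like the sketch below. It is written under assumptions: the `send` callable is injected so the example runs without a network, the scheme check is a simplification of real TLS enforcement, and the retry policy (retry on 5xx only, exponential backoff) is a common convention rather than documented Workbench behavior.

```python
import time
from urllib.parse import urlparse


def post_with_retry(url, payload, send, dry_run=False, max_attempts=3, backoff_seconds=0.5):
    """send(url, payload) is a caller-supplied function returning an HTTP status code."""
    if urlparse(url).scheme != "https":
        raise ValueError("endpoint must use https")
    if dry_run:
        # Simulate the transmission without calling send at all.
        return {"success": True, "dryRun": True, "attempts": 0}
    status = None
    for attempt in range(1, max_attempts + 1):
        status = send(url, payload)
        if 200 <= status < 300:
            return {"success": True, "dryRun": False, "status": status, "attempts": attempt}
        if status < 500:
            break  # client errors will not succeed on retry
        if attempt < max_attempts:
            time.sleep(backoff_seconds * 2 ** (attempt - 1))
    return {"success": False, "dryRun": False, "status": status, "attempts": attempt}


# Fake transport that fails once with 503, then succeeds.
flaky = iter([503, 200])
result = post_with_retry(
    "https://example.invalid/fhir",
    {"resourceType": "Bundle"},
    send=lambda url, payload: next(flaky),
    backoff_seconds=0,
)
```

Keeping the dry run check ahead of any call to `send` is the important design point: a dry run should be unable to transmit data even if the endpoint is misconfigured.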

🔐

Compliance Gate

Validate BAA status and Tier 2 eligibility before processing identifiable PHI.

Inputs: Check types (baa_signed, baa_not_expired, tier2_enabled)

Outputs: Passed/failed, failure reasons, BAA status details

Configuration: Required checks, blocking behavior
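
The three check types can be thought of as independent predicates over an organization's BAA record. The sketch below assumes a hypothetical record with `baa_signed_on`, `baa_expires_on`, and `tier2_enabled` fields; the actual data model is not described in this document.

```python
from datetime import date


def run_compliance_gate(org, required_checks, today):
    """Evaluate the requested checks and collect the names of any that fail."""
    checks = {
        "baa_signed": lambda: org.get("baa_signed_on") is not None,
        "baa_not_expired": lambda: (
            org.get("baa_expires_on") is not None and org["baa_expires_on"] >= today
        ),
        "tier2_enabled": lambda: bool(org.get("tier2_enabled")),
    }
    failures = [name for name in required_checks if not checks[name]()]
    return {"passed": not failures, "failureReasons": failures}


# Hypothetical organization whose BAA lapsed at the end of 2023.
org = {
    "baa_signed_on": date(2023, 1, 1),
    "baa_expires_on": date(2023, 12, 31),
    "tier2_enabled": True,
}
gate = run_compliance_gate(
    org,
    ["baa_signed", "baa_not_expired", "tier2_enabled"],
    today=date(2024, 6, 1),
)
```

With blocking behavior enabled, a workflow would stop at this node whenever `passed` is false, before any identifiable PHI is read.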

Building a Workflow

Follow these steps to create a PDF processing workflow:

Step 1: Fill in Workflow Details

When the builder opens, you'll see a "Workflow Details" form at the top. Fill in the required information:

  • Workflow Name (required): Enter a descriptive name (e.g., "Prior Authorization - Tier 1")
  • Category: Select from Document Processing, Data Extraction, Compliance, Integration, or Custom
  • Description: Explain what this workflow does (optional but recommended)
  • Icon: Choose an emoji to represent the workflow (e.g., 📄, 🔒, ⚡)

Example:

Name: "Prior Authorization - Tier 1" ✓
Category: Document Processing
Description: "Process prior auth documents with PHI redaction"
Icon: 📄

Step 2: Drag Nodes to Canvas

From the task library on the left, drag PDF workflow nodes onto the canvas:

  • Drag the "PDF Input" node first
  • Add "PHI Detection" below it
  • Add "Smart Redaction" next
  • Add "Output Validation" after that
  • Optionally add "API Post" for transmission

Step 3: Connect Nodes

Click and drag from the output handle (right side) of one node to the input handle (left side) of the next node:

PDF Input → PHI Detection → Smart Redaction → Output Validation → API Post

Connections show data flow direction and execution order. Always start with PDF Input for document workflows.
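
Because connections define execution order, a workflow can be viewed as a directed graph whose topological order is the run order. The sketch below derives that order with Python's standard `graphlib` module; representing connections as (source, target) pairs is an assumption made for illustration, not the builder's storage format.

```python
from graphlib import CycleError, TopologicalSorter

# Each pair is (source node, target node), matching the chain shown above.
connections = [
    ("PDF Input", "PHI Detection"),
    ("PHI Detection", "Smart Redaction"),
    ("Smart Redaction", "Output Validation"),
    ("Output Validation", "API Post"),
]


def execution_order(connections):
    """Return node names in an order that respects every connection."""
    sorter = TopologicalSorter()
    for source, target in connections:
        sorter.add(target, source)  # target depends on source
    # static_order() raises CycleError if the connections form a loop.
    return list(sorter.static_order())


order = execution_order(connections)
```

A connection that loops back to an earlier node has no valid order, so `execution_order` raises `CycleError` in that case.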

Step 4: Configure Nodes

Click on a node in the canvas to configure it in the right panel:

  • Step Details: Shows the selected node's name and type
  • Variable Mapper: Configure where each input comes from (User Input, Previous Step, Static Value, or System Variable)
  • Map required inputs first (marked with *) - these must be configured for the workflow to be valid
  • Optional inputs can be left unmapped or set to default values

Step 5: Define Input Mappings

For each node parameter, choose the data source:

  • User Input: Data provided when workflow executes (e.g., documentText)
  • Step Output: Output from a previous node (e.g., "PHI Detection" → entities)
  • Static Value: Hardcoded value you enter (e.g., confidenceThreshold: 0.8)
  • System Variable: Auto-generated by the system (e.g., execution_id, timestamp)
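
The four data sources can be pictured as a small lookup performed for each node input at execution time. The sketch below resolves the inputs of a Smart Redaction node; the mapping dictionary shape and the `resolve_input` helper are hypothetical, chosen only to make the four cases concrete.

```python
def resolve_input(mapping, user_input, step_outputs, system_vars):
    """Return the value for one node input according to its configured source."""
    source = mapping["source"]
    if source == "user_input":
        return user_input[mapping["key"]]
    if source == "step_output":
        return step_outputs[mapping["step"]][mapping["key"]]
    if source == "static":
        return mapping["value"]
    if source == "system":
        return system_vars[mapping["key"]]
    raise ValueError(f"unknown source: {source}")


# Hypothetical runtime state after PDF Input and PHI Detection have run.
user_input = {"pdfFile": "referral.pdf"}
step_outputs = {
    "PDF Input": {"extractedText": "Jane Doe has hypertension."},
    "PHI Detection": {"entities": [("NAME", "Jane Doe")]},
}
system_vars = {"execution_id": "exec-001"}

smart_redaction_mappings = {
    "documentText": {"source": "step_output", "step": "PDF Input", "key": "extractedText"},
    "entities": {"source": "step_output", "step": "PHI Detection", "key": "entities"},
    "executionId": {"source": "system", "key": "execution_id"},
    "strategy": {"source": "static", "value": "preserve_clinical"},
}
resolved = {
    name: resolve_input(mapping, user_input, step_outputs, system_vars)
    for name, mapping in smart_redaction_mappings.items()
}
```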

Step 6: Save Workflow

When you're done configuring:

  • Click the Save Workflow button in the top-right corner
  • The workflow will be validated - fix any errors shown (missing name, incomplete node configurations)
  • On successful save, you'll be redirected to the Workflows list
  • From the list, you can Edit, Run, Duplicate, or Delete the workflow
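
The save-time validation described above (missing name, incomplete node configuration) could be approximated as in the sketch below. The workflow dictionary layout, including the `required_inputs` and `mappings` keys, is an assumption for illustration and not the builder's actual schema.

```python
def validate_workflow(workflow):
    """Return a list of human-readable errors; an empty list means the workflow is valid."""
    errors = []
    if not workflow.get("name", "").strip():
        errors.append("Workflow name is required")
    for node in workflow.get("nodes", []):
        mappings = node.get("mappings", {})
        for input_name in node.get("required_inputs", []):
            if input_name not in mappings:
                errors.append(f'{node["name"]}: required input "{input_name}" is not mapped')
    return errors


# Hypothetical draft with no name and one unmapped required input.
draft = {
    "name": "",
    "nodes": [
        {
            "name": "Smart Redaction",
            "required_inputs": ["documentText", "entities"],
            "mappings": {"documentText": {"source": "step_output"}},
        }
    ],
}
errors = validate_workflow(draft)
```

Collecting all errors in one pass, rather than stopping at the first, lets them all be fixed before the next save attempt.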

💡 Tip: Use the "Test Workflow" button to dry-run the workflow before saving to catch configuration issues early.

Example: Tier 1 Workflow

A complete Tier 1 workflow that de-identifies documents before processing:

1. PDF Input

Input: pdfFile (user upload) → Output: extractedText, pageCount

2. PHI Detection

Input: extractedText (step 1) → Output: entities

3. Smart Redaction

Input: extractedText (step 1), entities (step 2), execution_id (system)

Config: strategy=preserve_clinical, tokenize=true

4. Output Validation

Input: redactedText (step 3), allowedPHIFields=[] (empty), scanPHI=true

Blocks if any PHI is detected in the output

5. API Post

Input: validatedData (step 4), endpoint="FHIR Server", dryRun=false

Transmits de-identified data to the external API
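
For readers who prefer to see the five steps as data, the sketch below writes the same Tier 1 pipeline as a plain Python structure and checks that every step reference points to an earlier step. The structure is an illustrative assumption, not the format in which the builder stores or exports workflows.

```python
# Each input maps to (source, reference); reference is a step number for "step" sources.
tier1_steps = [
    {"node": "PDF Input", "inputs": {"pdfFile": ("user", None)}},
    {"node": "PHI Detection", "inputs": {"extractedText": ("step", 1)}},
    {
        "node": "Smart Redaction",
        "inputs": {
            "extractedText": ("step", 1),
            "entities": ("step", 2),
            "execution_id": ("system", None),
        },
        "config": {"strategy": "preserve_clinical", "tokenize": True},
    },
    {
        "node": "Output Validation",
        "inputs": {"redactedText": ("step", 3)},
        "config": {"allowedPHIFields": [], "scanPHI": True},
    },
    {
        "node": "API Post",
        "inputs": {"validatedData": ("step", 4)},
        "config": {"endpoint": "FHIR Server", "dryRun": False},
    },
]


def forward_references(steps):
    """Return (step_number, input_name) pairs that point at a step that has not run yet."""
    bad = []
    for number, step in enumerate(steps, start=1):
        for name, (source, ref) in step["inputs"].items():
            if source == "step" and ref >= number:
                bad.append((number, name))
    return bad


problems = forward_references(tier1_steps)
```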

Example: Tier 2 Workflow

A Tier 2 workflow that preserves PHI under BAA coverage:

1. Compliance Gate

Checks: baa_signed, baa_not_expired, tier2_enabled

Blocks the workflow if the BAA is not valid

2. PDF Input

Input: pdfFile (user upload) → Output: extractedText

3. PHI Detection

Input: extractedText (step 2)

Detects PHI but does NOT redact (for monitoring only)

4. Output Validation

Input: extractedText (step 2), allowedPHIFields=["patientName", "patientId", "dateOfBirth"]

Allows PHI in designated fields, blocks unexpected PHI

5. API Post

Transmits identifiable PHI to the BAA-covered endpoint

Full audit logging with payload hash
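
Logging a payload hash rather than the payload lets an audit trail prove what was sent without storing PHI in the log. The sketch below shows one way to do this. SHA-256 over canonical JSON is an assumption, since this document does not specify the hash algorithm or log format, and the field names are hypothetical.

```python
import hashlib
import json


def payload_hash(payload):
    """Hash a canonical JSON encoding so key order does not change the digest."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_record(execution_id, endpoint, payload):
    """Build a log entry that identifies the payload without containing it."""
    return {
        "executionId": execution_id,
        "endpoint": endpoint,
        "payloadHash": payload_hash(payload),
    }


# Hypothetical Tier 2 payload with identifiable fields.
payload = {"patientName": "Jane Doe", "patientId": "12345678"}
record = audit_record("exec-001", "FHIR Server", payload)
```

The same payload always yields the same digest, so a recipient's copy can later be checked against the logged hash.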

Deploying Workflows

Test workflows with sample data before deploying to production:

Dry Run Mode

Enable dry run in API Post nodes to simulate transmission without actually sending data.

Test Execution

Use the "Execute Workflow" button with sample document text to validate each step.

Review Step Outputs

Examine output from each node to verify data transformations are correct.

Check Audit Logs

View execution history to troubleshoot failures and monitor performance.

Best Practices

Always include output validation

Critical safety net to prevent PHI leakage

Use descriptive node names

Helps team members understand workflow purpose

Test with sample data first

Validate workflow before deploying to production

Document workflow purpose

Add detailed description explaining use case and data flow

Use compliance gates for Tier 2

Automatically enforce BAA requirements

Monitor execution history

Review for patterns in failures or performance issues