File Processing API
Upload PDF, DOCX, and XLSX files with automatic PHI detection and 24-hour retention
POST /api/files/upload
Upload a file for AI processing. Files are automatically scanned for protected health information (PHI) and are deleted automatically 24 hours after upload.
Supported Formats
- PDF (text-based, not scanned images)
- DOCX (Microsoft Word 2007+)
- XLSX (Microsoft Excel 2007+)
Max file size: 10 MB
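The format and size limits above can be checked on the client before any bytes are sent. The following is a minimal sketch, not part of the API: `checkUploadCandidate` is a hypothetical helper, and it assumes that "10 MB" means 10 * 1024 * 1024 bytes, which the server may count differently.

```javascript
// Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
const MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024;
const SUPPORTED_EXTENSIONS = ['pdf', 'docx', 'xlsx'];

// Hypothetical helper: returns null when the file looks uploadable,
// or a human-readable reason when it does not.
// `file` is any object with `name` and `size`, such as a browser File.
function checkUploadCandidate(file) {
  const dot = file.name.lastIndexOf('.');
  const extension = dot === -1 ? '' : file.name.slice(dot + 1).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.includes(extension)) {
    return 'Unsupported file type. Use PDF, DOCX, or XLSX';
  }
  if (file.size > MAX_FILE_SIZE_BYTES) {
    return 'File size exceeds 10 MB limit';
  }
  return null;
}
```

A check like this only saves a round trip. It cannot tell a text-based PDF from a scanned one, so the server's response remains authoritative.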
Request Format
Use multipart/form-data encoding:
| Field | Type | Required | Description |
|---|---|---|---|
| file | binary | Yes | The file to upload (PDF, DOCX, or XLSX) |
Example Request
// JavaScript
const formData = new FormData();
formData.append('file', fileInput.files[0]);
const response = await fetch('/api/files/upload', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY'
    // Note: Do NOT set Content-Type - browser sets it automatically
  },
  body: formData
});
const result = await response.json();
# cURL
curl -X POST /api/files/upload \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@/path/to/document.pdf"
Response (200 OK)
| Field | Type | Description |
|---|---|---|
| fileHash | string | SHA-256 hash of the file (use in /api/ai/chat requests) |
| fileName | string | Original filename |
| fileSize | number | File size in bytes |
| mimeType | string | MIME type (application/pdf, etc.) |
| phiDetected | boolean | Whether PHI was detected in the file content |
| phiEntities | array | List of detected PHI entities with type and confidence |
| uploadedAt | string | ISO-8601 timestamp of upload |
| expiresAt | string | ISO-8601 timestamp when file will be deleted (24 hours) |
Example Response
{
  "fileHash": "sha256_f4d5e6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2c3d4e5",
  "fileName": "contract.pdf",
  "fileSize": 245678,
  "mimeType": "application/pdf",
  "phiDetected": false,
  "phiEntities": [],
  "uploadedAt": "2024-01-15T10:30:00Z",
  "expiresAt": "2024-01-16T10:30:00Z"
}
Response with PHI Detection
{
  "fileHash": "sha256_abc123def456...",
  "fileName": "employee_roster.xlsx",
  "fileSize": 89234,
  "mimeType": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  "phiDetected": true,
  "phiEntities": [
    {
      "type": "PERSON",
      "text": "Jane Doe",
      "score": 0.95
    },
    {
      "type": "US_SSN",
      "text": "123-45-6789",
      "score": 1.0
    },
    {
      "type": "PHONE_NUMBER",
      "text": "(555) 123-4567",
      "score": 0.98
    }
  ],
  "uploadedAt": "2024-01-15T11:00:00Z",
  "expiresAt": "2024-01-16T11:00:00Z"
}
⚠️ Important: PHI detection during upload is informational only. Policy enforcement happens when the file is processed via /api/ai/chat.
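Because detection at upload time is informational, a client may want to show the user a summary before sending the file on to /api/ai/chat. The sketch below is one way to do that, not something the API provides: `summarizePhi` is a hypothetical helper, and the default threshold of 0.9 is an arbitrary assumption rather than a documented value.

```javascript
// Hypothetical helper: counts detected entities per type, keeping only
// those whose score is at or above `minScore`.
// Assumption: the 0.9 default is illustrative, not an API-defined cutoff.
function summarizePhi(uploadResult, minScore = 0.9) {
  const counts = {};
  for (const entity of uploadResult.phiEntities || []) {
    if (entity.score >= minScore) {
      counts[entity.type] = (counts[entity.type] || 0) + 1;
    }
  }
  return counts;
}
```

The summary deliberately keeps only types and counts. The `text` field of each entity holds the matched value itself, so it is best left out of logs and analytics.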
Error Responses
400 Bad Request - Unsupported File Type
{ "error": "Unsupported file type. Use PDF, DOCX, or XLSX" }
400 Bad Request - Text Extraction Failed
{ "error": "Could not extract text from PDF. Ensure file is not image-only" }
413 Payload Too Large
{ "error": "File size exceeds 10 MB limit" }
401 Unauthorized
{ "error": "Authentication required" }
429 Too Many Requests
{ "error": "Rate limit exceeded. Max 50 uploads/minute" }
GET /api/files/:fileHash
Retrieve metadata about a previously uploaded file.
GET /api/files/sha256_abc123...
Authorization: Bearer YOUR_API_KEY
Response: Same structure as upload response
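In JavaScript, the metadata lookup might be wrapped as follows. This is a sketch under stated assumptions: `getFileMetadata` is a hypothetical helper, the `fetchImpl` parameter exists only so the function can be exercised without a network, and error bodies are assumed to follow the same `{ "error": "..." }` shape shown for upload errors.

```javascript
// Hypothetical helper: resolves to the file metadata, or to null when
// the hash is unknown or the file has already been deleted (404).
// `fetchImpl` defaults to the global fetch and can be replaced in tests.
async function getFileMetadata(fileHash, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl(`/api/files/${encodeURIComponent(fileHash)}`, {
    headers: { 'Authorization': `Bearer ${apiKey}` }
  });
  if (response.status === 404) {
    return null;
  }
  if (!response.ok) {
    // Assumption: error bodies look like { "error": "..." }.
    const body = await response.json().catch(() => ({}));
    throw new Error(body.error || `Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Returning null for 404 lets callers treat "not found" as an ordinary outcome and decide whether to re-upload, while other failures still surface as exceptions.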
Using Files with AI Tasks
After uploading a file, use the fileHash in your AI chat request:
// Step 1: Upload file
const formData = new FormData();
formData.append('file', fileInput.files[0]);
const uploadRes = await fetch('/api/files/upload', {
  method: 'POST',
  headers: { 'Authorization': 'Bearer YOUR_API_KEY' },
  body: formData
});
const { fileHash } = await uploadRes.json();
// Step 2: Process with AI
const chatRes = await fetch('/api/ai/chat', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    conversationId: 'new',
    taskId: 'contract-summarizer',
    fileHash: fileHash
  })
});
File Retention Policy
24-Hour Auto-Deletion
All uploaded files are automatically deleted 24 hours after upload for data minimization and HIPAA compliance. The expiresAt timestamp in the response indicates when deletion will occur.
- Files are stored encrypted at rest in Azure Blob Storage
- File content is never logged, and it is not retained beyond 24 hours
- After expiration, fileHash lookups will return 404 Not Found
- Re-upload the file if needed after expiration
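A client that holds on to an upload response can use expiresAt to decide whether a stored fileHash is still worth sending. The helpers below are a minimal sketch with hypothetical names (`isExpired`, `msUntilExpiry`). They rely on the client's clock, which may differ from the server's, so a 404 from the API should still be treated as the final word.

```javascript
// Hypothetical helper: true once the expiresAt time has been reached.
// `now` is injectable so the check can be tested deterministically.
function isExpired(uploadResult, now = new Date()) {
  return now.getTime() >= Date.parse(uploadResult.expiresAt);
}

// Hypothetical helper: milliseconds left before deletion, never negative.
function msUntilExpiry(uploadResult, now = new Date()) {
  return Math.max(0, Date.parse(uploadResult.expiresAt) - now.getTime());
}
```

For example, a caller could check `isExpired(result)` before building an /api/ai/chat request and re-upload the file first when it returns true.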