Architecture
AI Smart Redact is composed of three subsystems. The Manager stores files and runs detection or redaction jobs. The Orchestrator sits in front of the Manager to add authentication, user management, and the review workflow that powers the Human-in-the-Loop (HITL) web application. The Worker runs the detection pipeline and the AI model, and is internal to the Docker network.
Component diagram
In the default configuration, the Manager and Worker communicate through a REST API over HTTP. Each stateful subsystem has its own PostgreSQL database. The Manager and Worker share a single file storage volume and read and write files to it directly; the Manager doesn’t proxy file I/O for the Worker.
The Orchestrator additionally uses a Redis instance for caching DEK (data encryption key) tokens. Redis is shipped with the default Docker Compose stack but is omitted from the diagram below for clarity.
Subsystems
| Subsystem | Purpose | Default port |
|---|---|---|
| Manager | File storage, detection and redaction jobs, persistence | 9982 |
| Orchestrator | Authentication, user management, HITL workflow | 9983 |
| Worker | Detection pipeline, AI model inference | 4885 (internal) |
Manager
The Manager owns file storage and the job lifecycle. Clients call the Manager directly for API integrations, or reach it through the Orchestrator for browser-based workflows. The Manager persists job state in PostgreSQL and reads and writes PDFs and intermediate artifacts on the shared file storage volume.
Orchestrator
The Orchestrator wraps the Manager with JWT authentication, user accounts, and the review workflow that powers the HITL web application. It has its own PostgreSQL database for users, sessions, and HITL state. The HITL web application talks to the Orchestrator only.
The Orchestrator also uses a Redis instance as an optional cache for DEK tokens. Redis is included in the default Docker Compose stack but isn’t strictly required: if no Redis connection string is configured, the Orchestrator falls back to caching DEK tokens in memory.
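To illustrate this fallback behavior, here is a minimal sketch of a token cache that uses Redis when a client is configured and an in-process dictionary otherwise. The class and method names are hypothetical, not the Orchestrator's actual API; only the redis-py `set(..., ex=...)`/`get` calls reflect a real library.

```python
import time
from typing import Optional


class DekTokenCache:
    """Illustrative DEK-token cache: Redis when configured, in-memory otherwise."""

    def __init__(self, redis_client=None, ttl_seconds: int = 300):
        self._redis = redis_client  # e.g. redis.Redis(...) when a connection string is set
        self._ttl = ttl_seconds
        self._memory: dict[str, tuple[str, float]] = {}  # key -> (token, expiry)

    def set(self, key: str, token: str) -> None:
        if self._redis is not None:
            # Redis handles expiry natively via the `ex` argument.
            self._redis.set(key, token, ex=self._ttl)
        else:
            self._memory[key] = (token, time.monotonic() + self._ttl)

    def get(self, key: str) -> Optional[str]:
        if self._redis is not None:
            value = self._redis.get(key)
            return value.decode() if isinstance(value, bytes) else value
        entry = self._memory.get(key)
        if entry is None or entry[1] < time.monotonic():
            self._memory.pop(key, None)  # drop expired or missing entries
            return None
        return entry[0]
```

Note the trade-off the fallback implies: the in-memory cache is per-instance, so it only behaves identically to Redis when a single Orchestrator instance is running.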
Worker
The Worker accepts detection or redaction commands from the Manager (over HTTP/REST with the default transport), reads the PDF directly from shared file storage, runs the detection pipeline (pattern matching, keyword matching, and AI model inference), and writes the results back to file storage. The Worker's port (4885) isn’t exposed outside the Docker network.
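The three detection stages can be sketched as follows. This is an illustrative stand-in, not the Worker's actual implementation: the regex, keyword list, and `Detection` shape are hypothetical, and `model` is a placeholder for the AI inference stage.

```python
import re
from dataclasses import dataclass


@dataclass
class Detection:
    start: int
    end: int
    label: str
    source: str  # which pipeline stage produced the hit


def detect(text: str, keywords=("confidential",), model=None) -> list[Detection]:
    """Run three illustrative stages over text extracted from a PDF."""
    hits: list[Detection] = []

    # 1. Pattern matching: regexes for structured identifiers (here, a US-style SSN).
    for m in re.finditer(r"\b\d{3}-\d{2}-\d{4}\b", text):
        hits.append(Detection(m.start(), m.end(), "ssn", "pattern"))

    # 2. Keyword matching: literal, case-insensitive search terms.
    lowered = text.lower()
    for kw in keywords:
        idx = lowered.find(kw.lower())
        while idx != -1:
            hits.append(Detection(idx, idx + len(kw), "keyword", "keyword"))
            idx = lowered.find(kw.lower(), idx + 1)

    # 3. AI model inference: `model` stands in for the Worker's model stage.
    if model is not None:
        hits.extend(model(text))

    # Merge all stages into one span list ordered by position.
    return sorted(hits, key=lambda d: d.start)
```

The key structural point is that all three stages emit the same span type, so downstream redaction can treat pattern, keyword, and model hits uniformly.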
File access goes straight to the configured backend, which is either a shared local volume mounted on both Manager and Worker or an S3-compatible object store. The Worker doesn’t fetch or upload files through the Manager’s REST API.
Protocols
| Protocol | Where it’s used |
|---|---|
| HTTP | Clients to Manager or Orchestrator |
| HTTP/REST | Orchestrator to Manager, Manager to Worker (default transport) |
| SQL | Manager and Orchestrator to their PostgreSQL databases (via the Npgsql client) |
| file I/O | Manager and Worker each access shared storage directly (local volume or S3-compatible object store) |
| inference | detection pipeline to the AI model (both within the Worker) |
Deployment variants
The component diagram shows the default REST transport. For higher throughput, AI Smart Redact also supports a RabbitMQ-based variant with multiple Workers behind shared queues, multiple Manager instances behind a load balancer, and external file storage on S3. Refer to Scale AI Smart Redact for these variants and tuning guidance.
Next steps
- Get started with a Docker Compose deployment.
- Configure AI Smart Redact for production settings.
- See the API reference for endpoint documentation across the three subsystems.