Release notes

Learn about the changes, additions, and fixes in AI Smart Redact.

Version 1.0.0

05 May 2026

Added

  • AI Smart Redact detects and redacts Personally Identifiable Information (PII) from PDF documents. Send PDFs through the REST API or upload them through the Human-in-the-Loop (HITL) web interface. Both workflows return detection results with entity locations, and you can apply redactions to create sanitized documents. The HITL interface lets reviewers inspect detected PII before applying redactions.

  • Deterministic detection through two methods:

    • Pattern detection uses compiled regular expressions for structured formats like credit cards, IBANs, and email addresses. Patterns include checksum validation (Luhn, ISO 7064) where applicable.
    • Keyword detection uses the Aho-Corasick algorithm for fast matching of known sensitive terms from configurable denylists. You can also configure allowlists to suppress false positives for specific terms.
  • Semantic detection through an AI model that identifies context-dependent entities like person names, organizations, and physical addresses. You can configure entity labels at runtime through detection configurations.

  • 36 built-in entity labels:

    • Deterministic (pattern detection):
      • Financial: CREDIT_CARD, IBAN, BIC_SWIFT, MONEY, BARCODE, ISIN, LEI, VAT_NUMBER
      • Identification: UNIQUE_IDENTIFIER, VIN, NUMERIC_ID, ALPHANUMERIC_CODE, CURRENCY_CODE
      • Contact: EMAIL_ADDRESS, PHONE_NUMBER, DOMAIN_NAME, URL
      • Network: IP_ADDRESS, MAC_ADDRESS, HTTP_COOKIE
      • Location: GPS_COORDINATE, FILE_PATH
      • Temporal: DATE, DATETIME, TIME, DURATION
      • Numbers: DECIMAL_NUMBER, INTEGER_NUMBER, SCIENTIFIC_NUMBER, PERCENTAGE
      • Social: MENTION, HASHTAG
    • Semantic (AI-powered, default configuration): PERSON, ORGANISATION, PHYSICAL_ADDRESS, USERNAME
  • Context-aware confidence boosting. When context words appear near a pattern match, the detector raises that match's confidence score. For example, the word “email” near a detected email address increases the score for that address. Context words are available in seven languages: English, German, French, Italian, Spanish, Portuguese, and Dutch.

  • Detection configurations let you create and manage detection profiles that control score thresholds, enabled entity labels, languages, custom pattern recognizers, keyword lists, and exclusion rules. A starter template provides recommended defaults for new configurations.

    Screenshot of the AI Smart Redact detection configuration editor.
  • REST API for job processing. Upload PDF files, start detection or redaction jobs, and poll for results. The API uses asynchronous job processing, so you can send multiple documents and retrieve results when processing completes.

  • HITL web interface for interactive document review and redaction. Upload PDFs, view detected entities with confidence scores in an integrated PDF viewer, and accept or reject individual redaction suggestions before generating the final redacted document. Administrators can manage users and create detection configurations with a live testing panel to preview results against sample documents.

    Screenshot of the AI Smart Redact HITL interface.
  • User management and authentication. The HITL interface includes role-based access control. Administrators can create users, reset passwords, and manage account activation.

  • API key authentication for programmatic access to upload files, check job statuses, and download redacted documents. Keys support expiration dates.

  • Audit logging. All authentication events, user management actions, configuration changes, and file operations (upload, redaction, and download) are recorded in an append-only audit trail for compliance and troubleshooting.

  • Encryption at rest. Uploaded files are encrypted with AES-256-GCM using configurable keys. The service supports key rotation and Data Encryption Key (DEK) token caching with Redis.

  • Docker deployment. Deploy AI Smart Redact as Docker containers using the Docker Compose configurations from the AI Smart Redact samples repository, which also contains API usage samples. The service supports both CPU and GPU deployments; GPU deployment uses NVIDIA CUDA and provides faster inference for semantic detection.

  • OpenTelemetry observability. The service exports traces, metrics, and structured logs through OpenTelemetry. Connect to Grafana LGTM, Seq, Jaeger, or any OpenTelemetry-compatible backend for monitoring and diagnostics.
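
To illustrate how pattern detection, checksum validation, and context-aware confidence boosting can fit together, here is a minimal Python sketch. It is not the AI Smart Redact implementation: the regular expression, the context word list, the window size, the score values, and the function names are all hypothetical and chosen only for illustration. The Luhn checksum itself is the standard algorithm.

```python
import re

# Hypothetical candidate pattern: 13 to 19 digits, optionally separated
# by single spaces or hyphens. Not the product's actual pattern.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

# Hypothetical context words, scores, and window size (illustration only).
CONTEXT_WORDS = {"card", "credit", "visa", "mastercard"}
BASE_SCORE = 0.6
BOOSTED_SCORE = 0.9
CONTEXT_WINDOW = 40  # characters inspected on each side of a match


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect_credit_cards(text: str) -> list[dict]:
    """Find Luhn-valid card numbers and boost the score when context words are nearby."""
    results = []
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if not luhn_valid(digits):
            continue  # the checksum rejects random digit runs
        start = max(0, m.start() - CONTEXT_WINDOW)
        window = text[start:m.end() + CONTEXT_WINDOW].lower()
        words = set(re.findall(r"[a-z]+", window))
        score = BOOSTED_SCORE if words & CONTEXT_WORDS else BASE_SCORE
        results.append(
            {"label": "CREDIT_CARD", "start": m.start(), "end": m.end(), "score": score}
        )
    return results
```

In this sketch, a digit run that fails the checksum is dropped entirely, while a valid match receives the higher score only when one of the assumed context words appears within the window. A detection configuration's score threshold would then decide which matches are reported.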