Stirling-PDF Web GUI: Edit, Merge, and Sign PDF Files Locally via Docker

Run your own local PDF editing suite. Deploy Stirling-PDF using Docker Compose to merge, split, OCR, and convert files securely.

Stirling-PDF relies on a suite of underlying open-source rendering and conversion utilities to execute complex operations. These dependencies are integrated directly into the Docker image:

  1. PDF Rendering and Manipulation:
       - Apache PDFBox: A Java library used for split, merge, encrypt, decrypt, and form-filling operations.
       - OpenPDF: A fork of iText, handling low-level page extraction, metadata injection, and overlay rendering.
  2. Document Conversion:
       - LibreOffice (soffice): Required for headless conversion of Office formats (DOCX, XLSX, PPTX) to PDF. The standard (latest) Docker image contains a minimal LibreOffice installation, while the fat image (latest-fat) includes the complete suite along with additional system fonts to minimize formatting discrepancies.
       - WeasyPrint: A Python-based document rendering engine that compiles HTML and CSS into print-ready PDFs.
  3. Optical Character Recognition (OCR):
       - Tesseract OCR: The primary OCR engine. By default, the Stirling-PDF image is bundled with the English language trained data (eng.traineddata).
       - OCRmyPDF: A Python wrapper built on top of Tesseract and pikepdf. It generates a searchable text layer and embeds it into the original PDF file without rasterizing existing page graphics.

Advanced OCR Capabilities and Language Pack Integration

The OCR engine in Stirling-PDF processes scanned pages and constructs searchable PDF documents. Persisting language files and adding new language packs requires configuring host-mounted volumes for Tesseract's training data.

Persistent Tessdata Directory

To avoid downloading OCR language training files on every container restart, mount a host directory to the container's Tesseract path /usr/share/tessdata. When the container initializes, Stirling-PDF downloads missing language files specified in the LANGS environment variable to this location.

Specifying OCR Languages

Set the LANGS environment variable to the Tesseract language codes you want to load. These are mostly three-letter ISO 639-2 codes such as eng, deu, and fra. Separate multiple languages with commas:

environment:
  - LANGS=eng,deu,fra,spa

During initialization, Stirling-PDF matches these codes with files like eng.traineddata, deu.traineddata, etc., downloading them from the official Tesseract repositories if they are not already present in the mounted volume.
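To see which language files would still need to be fetched, you can compare a LANGS value against the contents of the mounted directory. The following is a minimal sketch, not part of Stirling-PDF: missing_traineddata is a hypothetical helper name, and it assumes the files follow the <code>.traineddata naming described above.

```shell
# Hypothetical helper (not shipped with Stirling-PDF).
# Prints each language code that has no <code>.traineddata file
# in the given directory, one code per line.
# Usage: missing_traineddata <tessdata_dir> <comma_separated_codes>
missing_traineddata() {
    dir=$1
    # Split the comma-separated list into one code per line.
    printf '%s\n' "$2" | tr ',' '\n' | while IFS= read -r code; do
        # Skip empty entries such as a trailing comma.
        [ -n "$code" ] || continue
        # Report the code only when its trained-data file is absent.
        [ -f "$dir/$code.traineddata" ] || printf '%s\n' "$code"
    done
}
```

Calling missing_traineddata ./tessdata eng,deu,fra,spa from the directory that holds docker-compose.yml lists the codes with no file yet. An empty result means every requested language is already present on the host.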

Security, Permissions, and Access Control

Deploying a web-based PDF utility on a public VPS requires implementing strict access controls, running containers with reduced privileges, and enabling user authentication.

Running with Non-Root Permissions

Running container processes as root increases the damage a container breakout vulnerability could cause. Stirling-PDF supports mapping file access to a specific host user using the PUID and PGID environment variables.

To identify the correct values, run the id command on your host:

id $USER

Use the returned uid (for PUID) and gid (for PGID) in your compose configuration. This ensures that any files created or modified by the container (configs, logs, custom files) match the owner of the host directory, preventing permission conflicts.
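If you prefer not to read the values off the id output by hand, a small function can print them in compose syntax. This is a sketch only: print_compose_ids is a hypothetical name, and the six-space indentation is an assumption that matches a list nested under a service's environment key.

```shell
# Hypothetical helper: print PUID/PGID entries for the current user,
# formatted as list items for the environment section of a compose file.
print_compose_ids() {
    # id -u prints the numeric user ID, id -g the numeric primary group ID.
    printf '      - PUID=%s\n' "$(id -u)"
    printf '      - PGID=%s\n' "$(id -g)"
}

print_compose_ids
```

Run it as the user who owns the host directories, then paste the two lines into your compose file.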

User Authentication and Security Group Activation

By default, Stirling-PDF runs in a public mode without user authentication. To restrict access to authorized users, you must enable authentication:

  1. Set DISABLE_ADDITIONAL_FEATURES=false so that the additional features, including the login system, are enabled when the container starts.
  2. Set SECURITY_ENABLELOGIN=true to redirect unauthenticated requests to a login screen.
  3. Define initial admin credentials using SECURITY_INITIALLOGIN_USERNAME and SECURITY_INITIALLOGIN_PASSWORD.

Upon first login, Stirling-PDF forces a password change. These user credentials and sessions are stored in an H2 database file inside the /configs directory, which must be persisted on the host system.
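Although the initial password is replaced at first login, it is still worth avoiding a guessable value while the instance is reachable. One possible way to generate a random initial password is sketched below. The generate_password name is hypothetical, and the sketch assumes /dev/urandom and the base64 utility are available on the host.

```shell
# Hypothetical helper: print a random alphanumeric password.
# Usage: generate_password [length]   (default length: 24)
generate_password() {
    length=${1:-24}
    # 64 random bytes encode to 88 base64 characters. Dropping '/', '+',
    # '=' and newlines leaves only letters and digits, which avoids
    # quoting problems in YAML and shell. cut trims to the requested length.
    head -c 64 /dev/urandom | base64 | tr -d '/+=\n' | cut -c "1-$length"
}

generate_password 24
```

Use the printed value for SECURITY_INITIALLOGIN_PASSWORD. Because of the trimming, this sketch is only suitable for lengths well below 80 characters.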

Production-Ready Docker Compose Configuration

Create a file named docker-compose.yml on your VPS. The following configuration defines the Stirling-PDF service, sets memory limits, configures user permissions, and exposes the application to a local port.

version: '3.8'

services:
  stirling-pdf:
    image: stirlingtools/stirling-pdf:latest
    container_name: stirling-pdf
    restart: unless-stopped
    ports:
      - "127.0.0.1:8080:8080"
    security_opt:
      - no-new-privileges:true
    deploy:
      resources:
        limits:
          memory: 2G
    environment:
      # Permissions
      - PUID=1000
      - PGID=1000
      - UMASK=022

      # Core Configuration
      - DOCKER_ENABLE_SECURITY=true
      - DISABLE_ADDITIONAL_FEATURES=false
      - SECURITY_ENABLELOGIN=true
      - SECURITY_INITIALLOGIN_USERNAME=admin
      - SECURITY_INITIALLOGIN_PASSWORD=stirling_admin_pass_change_me

      # OCR and Localization
      - LANGS=eng,deu,fra
      - SYSTEM_DEFAULTLOCALE=en-US

      # Performance and Constraints
      - SYSTEM_CONNECTIONTIMEOUTMINUTES=30
    volumes:
      - ./config:/configs
      - ./tessdata:/usr/share/tessdata
      - ./logs:/logs
      - ./customFiles:/customFiles

Note: The port binding 127.0.0.1:8080:8080 publishes the container port on the loopback interface only, so the application cannot be reached directly from the public internet. All external traffic must pass through your reverse proxy, which enforces SSL/TLS encryption.
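The four bind mounts in the compose file point at directories next to docker-compose.yml. When a bind-mount source does not exist, Docker creates it owned by root, which can conflict with the PUID and PGID settings. Creating the directories yourself first avoids that. The sketch below uses a hypothetical helper name, prepare_host_dirs, and takes the directory names from the volumes section above.

```shell
# Hypothetical helper: create the host directories used by the bind mounts.
# Usage: prepare_host_dirs [base_dir]   (default: current directory)
prepare_host_dirs() {
    base=${1:-.}
    # Names match the left-hand side of each entry under "volumes".
    for name in config tessdata logs customFiles; do
        # mkdir -p is a no-op when the directory already exists.
        mkdir -p "$base/$name" || return 1
    done
}
```

Run prepare_host_dirs . as the user whose uid and gid you set in PUID and PGID, then start the stack with docker compose up -d.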

Nginx Reverse Proxy & SSL/TLS Configuration

Serving Stirling-PDF over HTTPS requires a reverse proxy to terminate SSL connections. Additionally, because PDF operations involve large file uploads, you must configure Nginx to allow substantial payloads without returning a 413 Payload Too Large error.

Nginx Server Block

Create a new configuration file in your Nginx sites-available directory (e.g., /etc/nginx/sites-available/pdf.example.com):

server {
    listen 80;
    listen [::]:80;
    server_name pdf.example.com;

    # Redirect all HTTP traffic to HTTPS
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name pdf.example.com;

    # SSL Certificates (Managed via Certbot/Let's Encrypt)
    ssl_certificate /etc/letsencrypt/live/pdf.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/pdf.example.com/privkey.pem;

    # SSL Optimizations
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_prefer_server_ciphers on;
    ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384';
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 1d;
    ssl_session_tickets off;

    # Security Headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; img-src 'self' data:; font-src 'self' data:; connect-src 'self';" always;

    # Support Large PDF Uploads (Adjust based on your requirements)
    client_max_body_size 250M;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Disable buffering for larger file transfers
        proxy_buffering off;
        proxy_request_buffering off;

        # Timeouts for heavy PDF operations
        proxy_read_timeout 600s;
        proxy_send_timeout 600s;
        # Connecting to the local container is fast; Nginx usually caps this timeout at 75s
        proxy_connect_timeout 60s;
    }
}

Enable the configuration and reload Nginx:

ln -s /etc/nginx/sites-available/pdf.example.com /etc/nginx/sites-enabled/
nginx -t && systemctl reload nginx

Post-Deployment Verification & Administration

Once Stirling-PDF is deployed behind your reverse proxy, perform the following administrative tasks:

  1. Verify Services: Check that the Docker container is running without errors by analyzing the startup logs:

     docker compose logs -f stirling-pdf

  2. First Login: Navigate to https://pdf.example.com and log in using the initial credentials defined in your compose file.
  3. Change Password: Go to the profile or user administration menu in the top right to change the admin password to a strong, unique secret.
  4. Maintenance & Updates: To pull the latest updates of Stirling-PDF, download the updated image and recreate the container:

     docker compose pull
     # Recreate container in detached mode
     docker compose up -d --remove-orphans
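Because user accounts and settings live in the persisted config directory, you may want a copy of it before recreating the container. The following is a minimal sketch under stated assumptions: backup_config and the stirling-config- archive prefix are hypothetical, and stopping the container first is suggested here only as a precaution so the database file is not copied mid-write.

```shell
# Hypothetical helper: archive a config directory into a timestamped tarball.
# Usage: backup_config <config_dir> <backup_dir>
# Prints the path of the archive it created.
backup_config() {
    src=$1
    dest=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$dest/stirling-config-$stamp.tar.gz"
    mkdir -p "$dest" || return 1
    # -C switches to the parent directory so the archive stores
    # relative paths such as config/... instead of absolute ones.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    printf '%s\n' "$archive"
}
```

For example, run docker compose stop, then backup_config ./config ./backups, and only then pull and recreate the container.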