BFF and Security Architecture

Chapter 31 — Backend-for-Frontend and Security Architecture

Chapter 14 established RAPID AI’s three-service topology: a Python FastAPI engine for diagnostic intelligence, a SvelteKit Explorer for the frontend and BFF layer, and PostgreSQL with pgvector for persistence. Chapter 20 documented the API contract between them. Chapter 30 described how the agentic copilot reasons about diagnostics through tool calls. This chapter closes the architectural story by explaining how these services are secured, how the BFF pattern protects the diagnostic engine from public exposure, and how the platform deploys in two distinct modes to serve both end users and enterprise integrations.

Security in an industrial diagnostic platform is not an afterthought. A compromised RAPID AI instance could issue false diagnostic clearances, suppress real failure warnings, or leak proprietary IMS rule libraries. The architecture described here treats every network boundary as hostile and every input as untrusted.

31.1 The BFF Pattern

Why the Browser Never Talks to Python

The Python FastAPI backend is RAPID AI’s intellectual core. It contains 451+ physics-based diagnostic rules, the SEDL entropy engine, the AESF stability framework, the FRETTLSM causal taxonomy, Weibull RUL estimation, and the CDE contradiction engine. Exposing this directly to the public internet would create an unacceptable attack surface — not because FastAPI is insecure, but because an engineering intelligence engine should not also be responsible for session management, CSRF protection, rate limiting, locale formatting, and browser-specific concerns.

The SvelteKit server sits between the browser and the Python backend as a deliberate architectural boundary:

Browser (Svelte 5 client)
    |
    |  HTTPS, session cookie, CSRF token
    |
    v
SvelteKit Server (BFF Layer)       <-- PUBLIC INTERNET BOUNDARY
    |  Bun 1.3, port 5173
    |
    |  HTTP, service JWT (RS256)
    |  Internal network only
    v
Python FastAPI Backend             <-- PRIVATE NETWORK ONLY
    |  Python 3.13, port 8000
    |
    v
PostgreSQL 17 + pgvector           <-- PRIVATE NETWORK ONLY
    port 5432

The browser sends requests to SvelteKit server routes (+page.server.ts and +server.ts files). These routes authenticate the user, validate the request, call the Python backend over the internal network, transform the response into UI-ready data, and return it to the browser. The Python backend binds to 127.0.0.1 or a Docker internal network. In the full-product deployment (Mode A, Section 31.5) it is unreachable from outside the server boundary; only the API-only deployment (Mode B) exposes it directly, behind API key authentication.

What the BFF Does

The BFF is not a thin reverse proxy. It is a purposeful transformation and security layer with six responsibilities:

Authentication and session management. The better-auth library runs on the SvelteKit server. It issues httpOnly cookies, manages OAuth flows, and stores sessions server-side. The Python backend never sees raw user credentials.
Authorization enforcement. Every server route checks the user’s role before calling the Python backend. An operator cannot access the rule management API. A manager cannot trigger a raw diagnostic pipeline run.
Request validation. Zod schemas validate every incoming request on the SvelteKit server before it reaches Python. Malformed requests are rejected at the boundary, not forwarded to the engine.
Data transformation. The Python backend returns engineering-correct output: raw confidence scores, AESF state codes, Weibull parameters, entropy vectors. The BFF transforms these into UI-ready data with labels, colors, icons, and role-appropriate detail levels.
API aggregation. A fleet dashboard requires data from every asset. The BFF calls multiple Python endpoints in parallel over the internal network and returns a single merged response to the browser, eliminating dozens of round trips.
Rate limiting and abuse protection. Per-user and per-endpoint rate limits are enforced at the SvelteKit layer. The Python backend is shielded from traffic spikes.
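
The aggregation responsibility can be sketched as follows. This is a minimal illustration in Python rather than the BFF's own TypeScript, and everything in it is assumed for the example: fetch_asset_state stands in for an internal HTTP call, and the asset tags, the "STABLE" state code, and the response fields are placeholders, not the real contract.

```python
import asyncio

async def fetch_asset_state(asset_id: str) -> dict:
    """Hypothetical stand-in for one internal call to the diagnostic engine."""
    await asyncio.sleep(0.01)  # simulate internal-network latency
    if asset_id.startswith("bad"):
        raise RuntimeError(f"engine error for {asset_id}")
    return {"asset_id": asset_id, "aesf_state": "STABLE", "confidence": 0.93}

async def fleet_summary(asset_ids: list[str]) -> dict:
    # Issue every backend call concurrently instead of one after another.
    results = await asyncio.gather(
        *(fetch_asset_state(a) for a in asset_ids),
        return_exceptions=True,  # collect failures instead of aborting the batch
    )
    assets, failed = [], []
    for asset_id, result in zip(asset_ids, results):
        if isinstance(result, Exception):
            failed.append(asset_id)  # one failing asset does not sink the page
        else:
            assets.append(result)
    return {"assets": assets, "failed": failed, "count": len(assets)}

summary = asyncio.run(fleet_summary(["P-101", "P-102", "C-201"]))
```

Because the calls overlap, total latency is close to that of the slowest single call rather than the sum of all calls, and the browser still receives one merged payload.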

Contract Separation

The BFF creates two independent API contracts:

Internal contract (SvelteKit to Python): Stable, versioned at /rapid-ai/v1/, engineering-focused. Changes when the diagnostic engine evolves.
Frontend contract (Browser to SvelteKit): UI-focused, shaped for specific views. Changes when the interface evolves.

These contracts change for different reasons at different times. The BFF absorbs the mismatch, allowing the diagnostic engine and the user interface to evolve independently.

31.2 Authentication Architecture

better-auth Integration

RAPID AI uses better-auth for authentication, running entirely on the SvelteKit server. The library provides:

Session management. Sessions are stored server-side in PostgreSQL (via Drizzle ORM). The browser receives an httpOnly, Secure, SameSite=Lax cookie containing only the session ID. No tokens, no user data, no role information is stored client-side.
OAuth providers. Google, Microsoft, and GitHub OAuth for enterprise SSO. The OAuth callback is handled server-side; the browser never sees the OAuth token exchange.
Email and password. For organizations that do not use SSO. Passwords are hashed with argon2id. Password reset flows use time-limited, single-use tokens.
Session rotation. Sessions are rotated on privilege escalation (role change, password change) to prevent session fixation attacks.

Role-Based Access Control

RAPID AI defines four roles, each with its own access scope:

Role	Description	Access Scope
Operator	Plant floor personnel	View dashboards, assigned assets only, simplified diagnostics
Reliability Engineer	Domain expert	Full diagnostic detail, RCM workbooks, copilot, rule citations
Manager	Maintenance/plant manager	Fleet summaries, cost impact, maintenance scheduling, reporting
Administrator	System admin	User management, asset configuration, rule deployment, audit logs

Role enforcement happens at two levels:

SvelteKit server routes check the user’s role from the session before calling the Python backend. A route for /admin/users rejects non-administrators with a 403 before any backend call is made.
Python FastAPI middleware validates the role claim in the service JWT. Even if an attacker bypasses the BFF (which requires compromising the internal network), the Python backend independently verifies authorization.

Internal Service-to-Service JWT

Communication between SvelteKit and Python uses short-lived JWTs:

Algorithm: RS256 (asymmetric). The SvelteKit server holds the private key; the Python backend validates with the public key.
Expiry: 5 minutes. Tokens are generated per-request, not cached.
Claims: iss (SvelteKit service identifier), sub (authenticated user ID), roles (array of user roles), tenant_id (for multi-tenant deployments), exp (expiration timestamp).
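No refresh tokens. Service JWTs are ephemeral. If the SvelteKit server is compromised, an attacker can mint valid JWTs only while the Python backend still trusts the public key that matches the stolen private key. Rotating the key pair and removing the old public key from the backend immediately rejects every token signed with the stolen key, including tokens that were already issued.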

JWTs are used strictly for internal service-to-service communication. They are never sent to the browser and never stored in cookies or localStorage.
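
The claim set and its checks can be sketched as below. This is a minimal sketch that covers claims only: RS256 signing and signature verification are assumed to be handled by a JWT library before check_claims runs. The issuer string "sveltekit-bff" and both function names are hypothetical.

```python
import time
from typing import List, Optional

TOKEN_TTL_SECONDS = 5 * 60         # 5-minute expiry, as listed above
EXPECTED_ISSUER = "sveltekit-bff"  # hypothetical issuer identifier

def build_claims(user_id: str, roles: List[str], tenant_id: str,
                 now: Optional[float] = None) -> dict:
    """Assemble the claim set the BFF would sign for a single request."""
    issued = int(time.time()) if now is None else int(now)
    return {
        "iss": EXPECTED_ISSUER,
        "sub": user_id,
        "roles": list(roles),
        "tenant_id": tenant_id,
        "exp": issued + TOKEN_TTL_SECONDS,
    }

def check_claims(claims: dict, required_role: str,
                 now: Optional[float] = None) -> Optional[str]:
    """Return None when the claims are acceptable, else a rejection reason."""
    current = int(time.time()) if now is None else int(now)
    if claims.get("iss") != EXPECTED_ISSUER:
        return "unknown issuer"
    exp = claims.get("exp")
    if not isinstance(exp, int) or exp <= current:
        return "expired"
    if required_role not in claims.get("roles", []):
        return "forbidden"
    return None

claims = build_claims("user-1", ["operator"], "tenant-1", now=1_000)
```

Checking the role on the backend as well as in the BFF is what makes the second enforcement level independent of the first.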

31.3 API Protection Layers

Security is implemented as defense in depth — four concentric layers, each independently capable of rejecting malicious requests.

Layer 1: SvelteKit Middleware

The outermost layer runs on every incoming request before any route handler executes:

Authentication check. Unauthenticated requests to protected routes receive a 302 redirect to the login page (for page requests) or a 401 JSON response (for API requests).
CSRF protection. SvelteKit’s built-in CSRF protection validates the Origin header on state-changing requests. Custom CSRF tokens are used for forms.
Rate limiting. Token bucket algorithm per authenticated user. Default limits: 100 requests/minute for read endpoints, 20 requests/minute for write endpoints, 5 requests/minute for diagnostic pipeline triggers.
Request size limits. 1 MB maximum body size. Requests exceeding this limit are rejected before parsing.
Security headers. Content-Security-Policy, X-Content-Type-Options: nosniff, X-Frame-Options: DENY, Strict-Transport-Security with a 1-year max-age.
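
A token bucket of the kind named above can be sketched as follows. The sketch is in Python for illustration (the BFF itself is TypeScript), keeps its buckets in process memory, and uses the default per-minute limits from the list. The class and function names are hypothetical, and a multi-instance deployment would need shared storage instead of a dictionary.

```python
import time
from typing import Dict, Optional, Tuple

# Requests per minute, taken from the default limits listed above.
LIMITS_PER_MINUTE = {"read": 100, "write": 20, "pipeline": 5}

class TokenBucket:
    def __init__(self, per_minute: int, now: Optional[float] = None):
        self.capacity = float(per_minute)
        self.refill_per_second = per_minute / 60.0
        self.tokens = self.capacity  # start full, which permits an initial burst
        self.updated_at = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

_buckets: Dict[Tuple[str, str], TokenBucket] = {}

def allow_request(user_id: str, category: str,
                  now: Optional[float] = None) -> bool:
    """One bucket per (user, endpoint category) pair."""
    key = (user_id, category)
    if key not in _buckets:
        _buckets[key] = TokenBucket(LIMITS_PER_MINUTE[category], now)
    return _buckets[key].allow(now)
```

Unlike a fixed one-minute window, the bucket refills continuously, so a user who exhausts the pipeline limit regains one request roughly every 12 seconds instead of waiting for the window to reset.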

Layer 2: SvelteKit Server Routes

Each +page.server.ts and +server.ts file implements route-specific protection:

Input validation. Zod schemas parse and validate every field. Asset IDs are validated as UUIDs. Date ranges are checked for sanity. Sensor tag names are validated against a whitelist.
Role authorization. The route checks whether the user’s role permits the requested operation.
Data transformation. Responses are filtered by role before returning to the browser. An operator never receives raw confidence scores or rule IDs, even if the Python backend includes them.
Audit logging. Every request is logged with: request ID (UUID), user ID, user role, endpoint, HTTP method, response status, and processing time. For write operations, the request payload is logged (with sensitive fields redacted).
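
Role-based response filtering can be sketched as an allowlist of fields per role. The field names and the role keys below are hypothetical; the only behavior taken from the text is that an operator never receives raw confidence scores or rule IDs, even when the backend sends them.

```python
# Hypothetical per-role field allowlists; anything not listed is dropped.
ROLE_VISIBLE_FIELDS = {
    "operator": {"asset_id", "status_label", "recommended_action"},
    "reliability_engineer": {
        "asset_id", "status_label", "recommended_action",
        "confidence", "rule_ids",
    },
}

def filter_for_role(payload: dict, role: str) -> dict:
    """Keep only the fields the role may see; unknown roles are rejected."""
    allowed = ROLE_VISIBLE_FIELDS.get(role)
    if allowed is None:
        raise PermissionError(f"no field policy for role {role!r}")
    return {key: value for key, value in payload.items() if key in allowed}

backend_response = {
    "asset_id": "P-101",
    "status_label": "Degrading",
    "recommended_action": "Inspect bearing",
    "confidence": 0.87,
    "rule_ids": ["R-0042"],
}
operator_view = filter_for_role(backend_response, "operator")
engineer_view = filter_for_role(backend_response, "reliability_engineer")
```

An allowlist fails closed: a field added to the backend response later stays hidden from every role until someone explicitly exposes it.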

Layer 3: FastAPI Middleware

The Python backend validates every incoming request independently:

Service JWT verification. The Authorization: Bearer <jwt> header is validated on every request. Invalid, expired, or missing tokens return 401.
API key verification. For Mode B (API-only) deployments, requests are authenticated via X-API-Key header. Keys are scoped to specific operations and rate-limited independently.
Pydantic model validation. Every request body is parsed through a Pydantic model. Invalid payloads return 422 with detailed validation errors (in development) or generic errors (in production, to avoid information leakage).
Request ID propagation. The X-Request-Id header from the BFF is propagated through all internal calls and into database audit records, enabling end-to-end tracing.
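
Scoped API key verification for Mode B can be sketched as below. The key value, the scope names, and the choice of 401 versus 403 are assumptions made for the example. Storing only a SHA-256 digest of each key and comparing in constant time are general good practice, not something the source states about RAPID AI.

```python
import hashlib
import hmac
from typing import Optional

def _digest(key: str) -> str:
    return hashlib.sha256(key.encode("utf-8")).hexdigest()

# Hypothetical key store: digest of the key -> scopes granted to it.
API_KEYS = {
    _digest("demo-key-cmms"): {"recommendations:read"},
    _digest("demo-key-historian"): {"sensor-data:write"},
}

def authorize(presented_key: Optional[str], required_scope: str) -> int:
    """Return the HTTP status a middleware could answer with."""
    if not presented_key:
        return 401  # no X-API-Key header
    presented = _digest(presented_key)
    for stored, scopes in API_KEYS.items():
        # Constant-time comparison avoids leaking how many characters matched.
        if hmac.compare_digest(stored, presented):
            return 200 if required_scope in scopes else 403
    return 401  # unknown key
```

Keeping scopes on the key record means a leaked historian key can push sensor data but cannot read maintenance recommendations.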

Layer 4: Database-Level Row-Level Security

For multi-tenant deployments, PostgreSQL row-level security (RLS) provides the final protection layer:

-- Every tenant-scoped table has a tenant_id column
ALTER TABLE assets ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON assets
  USING (tenant_id = current_setting('app.current_tenant_id')::uuid);

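-- Set at the start of each transaction. SET LOCAL is discarded at COMMIT or
-- ROLLBACK, so a pooled connection cannot carry one tenant's ID into the next request.
BEGIN;
SET LOCAL app.current_tenant_id = '<tenant-uuid>';
-- ... tenant-scoped queries run here ...
COMMIT;

RLS ensures that even if application-level authorization has a bug, one tenant’s data is invisible to another. The tenant_id is set from the JWT’s tenant_id claim at the beginning of each database transaction. It is set once, centrally, so individual queries neither supply nor override it. The guarantee holds only when the application connects as a role that is subject to the policy: PostgreSQL exempts superusers and roles with BYPASSRLS, and exempts the table owner unless FORCE ROW LEVEL SECURITY is set on the table.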

31.4 Deployment Architecture

Network Topology

                    INTERNET
                       |
                       v
              +------------------+
              |  Load Balancer   |
              |  (TLS termination)|
              +--------+---------+
                       |
          PUBLIC NETWORK (DMZ)
                       |
              +--------+---------+
              |  SvelteKit BFF   |
              |  Bun 1.3         |
              |  port 5173       |
              +--------+---------+
                       |
     ---- PRIVATE NETWORK BOUNDARY ----
                       |
         +-------------+-------------+
         |                           |
+--------+---------+   +------------+----------+
|  Python FastAPI  |   |  PostgreSQL 17        |
|  Python 3.13     |   |  + pgvector           |
|  port 8000       |   |  port 5432            |
|  (internal only) |   |  (internal only)      |
+------------------+   +--+-----------+--------+
                           |           |
                     +-----+     +-----+------+
                     | Data |    | Vector     |
                     | Tables|   | Embeddings |
                     +------+    +------------+

Service Deployment Options

SvelteKit BFF is edge-deployable, adapting to multiple hosting environments:

Platform	Adapter	Notes
Cloudflare Workers	`@sveltejs/adapter-cloudflare`	Edge deployment, sub-50ms TTFB globally
Vercel	`@sveltejs/adapter-vercel`	Serverless functions, automatic scaling
Node/Bun	`@sveltejs/adapter-node`	Traditional server, full WebSocket support
Docker	`@sveltejs/adapter-node` + Dockerfile	Self-hosted, full control

Python FastAPI runs as a containerized service on the private network:

Docker image: Python 3.13 slim base, multi-stage build. Final image under 300 MB.
Process manager: uvicorn with 4 workers (auto-scaled to CPU count in production).
Health endpoint: GET /rapid-ai/v1/health returns service status, database connectivity, and LLM provider reachability.
Not exposed to the public internet in Mode A. No published ports. Accessible only from the SvelteKit container on the shared Docker network, or via ClusterIP in Kubernetes. Mode B (Section 31.5) is the one deliberate exception.

PostgreSQL 17 + pgvector runs as a managed instance or self-hosted:

Option	Use Case
Supabase	Managed Postgres with built-in pgvector, auth helpers, real-time subscriptions
Neon	Serverless Postgres with branching, scales to zero for development
Self-hosted (Docker)	Full control, air-gapped deployments for sensitive industrial environments

31.5 Dual-Mode Deployment

RAPID AI supports two deployment modes to serve different customer segments.

Mode A: Full Product

The standard deployment for end users who interact through the browser.

Browser  -->  SvelteKit BFF  -->  Python FastAPI  -->  PostgreSQL
                  |
            better-auth sessions
            SSR dashboards
            SSE real-time updates
            Copilot WebSocket

All traffic flows through the BFF. Users authenticate with sessions. The Python backend is invisible to the outside world. This is the deployment described throughout this book.

Mode B: API-Only

For enterprise customers who integrate RAPID AI’s diagnostic intelligence into their existing systems (CMMS, SCADA historians, custom dashboards).

Enterprise System  -->  Python FastAPI  -->  PostgreSQL
                            |
                      API key auth (X-API-Key)
                      Rate limiting per key
                      OpenAPI documentation

In Mode B, the Python FastAPI backend is exposed directly (with API key authentication) and the SvelteKit layer is not deployed. This mode is used when:

The customer has their own frontend and only needs the diagnostic engine.
Integration is machine-to-machine (CMMS pulling maintenance recommendations, historian pushing sensor data).
The customer’s security policy requires their own authentication layer in front of all services.

Shared Core

Both modes use the same:

Diagnostic pipeline (Modules A through E)
Rule evaluation engine (451+ rules)
SEDL entropy computation
AESF stability framework
FRETTLSM causal taxonomy
CDE contradiction detection
PostgreSQL schema and data
Pydantic validation models

The only difference is the security and presentation layer. Mode A adds the BFF’s transformation, caching, SSR, and session management. Mode B exposes the raw engineering API with API key protection.

Configuration

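The deployment mode is controlled by environment variables, a Docker Compose profile, and a small override file. The base file never publishes the FastAPI port; Mode B adds the port mapping through the override, because Compose rejects an empty ports entry:

# docker-compose.yml (base file, used by both modes)
services:
  sveltekit:
    profiles: ["full"]        # Only in Mode A
    build: ./frontend
    ports:
      - "5173:5173"
    networks:
      - public
      - internal

  fastapi:
    build: ./backend
    # No ports entry: reachable only on the internal network
    networks:
      - internal
    environment:
      - DEPLOYMENT_MODE=${DEPLOYMENT_MODE:-full}
      - API_KEY_AUTH_ENABLED=${API_KEY_AUTH:-false}

  postgres:
    image: pgvector/pgvector:pg17
    networks:
      - internal

networks:
  public:
  internal:

# docker-compose.api.yml (Mode B override; the file name is illustrative)
services:
  fastapi:
    ports:
      - "${API_PORT:-8000}:8000"

Mode A: docker compose --profile full up. SvelteKit is deployed and FastAPI has no published ports.

Mode B: API_KEY_AUTH=true docker compose -f docker-compose.yml -f docker-compose.api.yml up fastapi postgres. The FastAPI port is published, API key auth is enabled, and SvelteKit is not deployed. DEPLOYMENT_MODE must also be set to the API-only value the backend expects.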

31.6 Security Hardening

OWASP Top 10 Mitigations

Each OWASP Top 10 risk is addressed with specific controls relevant to RAPID AI’s architecture:

OWASP Risk	Mitigation
A01: Broken Access Control	Role-based access at both SvelteKit and FastAPI layers. RLS at database level. Object IDs are UUIDs, never sequential integers, so they cannot be enumerated; every access is still authorized by role and tenant.
A02: Cryptographic Failures	TLS everywhere (even internal in enterprise deployments). Passwords hashed with argon2id. Service JWTs signed with RS256. No secrets in code — environment variables only.
A03: Injection	Drizzle ORM (TypeScript) and SQLAlchemy Core (Python) use parameterized queries exclusively. No raw SQL string concatenation anywhere in the codebase. Zod and Pydantic validate all inputs before they reach query builders.
A04: Insecure Design	BFF pattern isolates the diagnostic engine. Defense in depth with four validation layers. Threat modeling performed per deployment topology.
A05: Security Misconfiguration	Production Docker images run as non-root. Default credentials are rejected at startup. Security headers are set in middleware, not left to deployment configuration.
A06: Vulnerable Components	`npm audit` and `pip-audit` run in CI. Dependabot monitors for CVEs. Base Docker images are pinned to specific digests, not floating tags.
A07: Auth Failures	better-auth handles session management with secure defaults. Session rotation on privilege changes. Account lockout after 5 failed attempts with exponential backoff.
A08: Data Integrity Failures	Docker images are signed. CI/CD pipeline requires passing tests before deployment. Database migrations are version-controlled and reviewed.
A09: Logging Failures	Structured JSON logging at every layer. Request IDs enable end-to-end tracing. Audit logs for all diagnostic decisions are immutable (append-only table with no DELETE permission).
A10: SSRF	The SvelteKit BFF only calls the Python backend at a hardcoded internal URL. No user-supplied URLs are fetched server-side. The Python backend makes no outbound connections except to configured LLM providers and the database.

Input Validation at Every Boundary

Validation is never skipped, even between trusted services:

Browser Input
    |
    v
Zod Schema (SvelteKit)          # TypeScript validation
    |  Rejects: wrong types, missing fields, invalid ranges
    v
Service JWT (contains validated user context)
    |
    v
Pydantic Model (FastAPI)        # Python validation
    |  Rejects: wrong types, constraint violations
    v
SQLAlchemy Parameterized Query  # Database validation
    |  Rejects: type mismatches, constraint violations
    v
PostgreSQL CHECK Constraints    # Final enforcement
    |  Rejects: domain violations
    v
RLS Policy                      # Tenant isolation

If an attacker bypasses Zod validation (by crafting raw HTTP requests), Pydantic catches it. If Pydantic has a bug, PostgreSQL constraints catch it. If the application has a tenant isolation bug, RLS catches it. No single layer failure compromises the system.

SQL Injection Prevention

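Application queries go through each library’s query builder rather than hand-assembled SQL strings:

// Drizzle ORM (SvelteKit) -- parameterized by design
// `db` and `assetsTable` come from the application's Drizzle setup and schema
import { eq } from "drizzle-orm";

const assets = await db
  .select()
  .from(assetsTable)
  .where(eq(assetsTable.tenantId, tenantId));

# SQLAlchemy Core (FastAPI) -- parameterized by design
# `assets_table` is a sqlalchemy.Table; `conn` is an AsyncConnection
from sqlalchemy import select

stmt = select(assets_table).where(
    assets_table.c.tenant_id == tenant_id
)
result = await conn.execute(stmt)

Both query builders generate parameterized SQL. The tenant_id value is passed as a bind parameter, never interpolated into the query string. Both libraries also offer raw-SQL escape hatches (sql.raw in Drizzle, text() in SQLAlchemy), so a CI linter flags any use of raw SQL strings or f-string query construction.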

Secrets Management

RAPID AI follows a strict no-secrets-in-code policy:

Environment variables for all secrets: database URLs, API keys, JWT private keys, LLM provider keys, OAuth client secrets.
.env files are listed in .gitignore and .dockerignore. A .env.example file documents required variables without values.
Docker secrets for production deployments. Secrets are mounted as files in /run/secrets/, not passed as environment variables (which can leak via /proc).
Key rotation is supported without downtime. The Python backend accepts multiple JWT public keys during rotation windows. Old keys are removed after all in-flight tokens expire.
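
Reading a secret from a mounted file, with an environment variable as the fallback, can be sketched as follows. The helper name, the upper-cased variable naming, and the fallback order are assumptions for illustration; only the /run/secrets/ location comes from the list above.

```python
import os
from pathlib import Path

def read_secret(name: str, secrets_dir: str = "/run/secrets") -> str:
    """Prefer a mounted secret file; fall back to an environment variable."""
    path = Path(secrets_dir) / name
    if path.is_file():
        # Mounted files usually end with a newline that is not part of the secret.
        return path.read_text(encoding="utf-8").strip()
    value = os.environ.get(name.upper())
    if value:
        return value
    # Fail at startup instead of running with a missing or empty secret.
    raise RuntimeError(f"secret {name!r} is not configured")
```

Failing loudly when neither source provides a value keeps a misconfigured container from starting with an empty signing key or database password.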

Audit Trail for Diagnostic Decisions

Every diagnostic decision is recorded in an append-only audit table:

CREATE TABLE diagnostic_audit_log (
    id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    request_id    UUID NOT NULL,
    user_id       UUID NOT NULL,
    asset_id      UUID NOT NULL,
    tenant_id     UUID NOT NULL,
    action        TEXT NOT NULL,       -- 'pipeline_run', 'copilot_query', 'rcm_decision'
    input_hash    TEXT NOT NULL,       -- SHA-256 of request payload
    output_hash   TEXT NOT NULL,       -- SHA-256 of response payload
    result_summary JSONB NOT NULL,     -- Key findings (state, top faults, confidence)
    created_at    TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- No UPDATE, DELETE, or TRUNCATE permissions granted to the application role
REVOKE UPDATE, DELETE, TRUNCATE ON diagnostic_audit_log FROM app_user;

This table answers the question every plant manager asks after an incident: “What did the system recommend, when, and based on what data?” Because the full payloads may be stored separately, the input_hash and output_hash let an auditor confirm that a stored payload is byte-for-byte the one the system processed when the row was written.
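
Computing the two hashes can be sketched as below. The canonical JSON serialization (sorted keys, compact separators) is an assumption: the schema only says SHA-256 of the payload, and whichever serialization is chosen must be identical at write time and at audit time. The function names and response fields are hypothetical.

```python
import hashlib
import json

def payload_hash(payload: dict) -> str:
    """SHA-256 over a canonical JSON form, so key order cannot change the hash."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def audit_row(request_id: str, user_id: str, asset_id: str, tenant_id: str,
              action: str, request: dict, response: dict) -> dict:
    """Build the column values for one diagnostic_audit_log insert."""
    summary_keys = ("state", "top_faults", "confidence")  # hypothetical fields
    return {
        "request_id": request_id,
        "user_id": user_id,
        "asset_id": asset_id,
        "tenant_id": tenant_id,
        "action": action,
        "input_hash": payload_hash(request),
        "output_hash": payload_hash(response),
        "result_summary": {k: response[k] for k in summary_keys if k in response},
    }
```

An auditor who later retrieves the archived payload recomputes payload_hash and compares it with the stored column; any mismatch means the archive is not what the pipeline processed.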

31.7 Monitoring and Observability

Structured Logging

All services emit structured JSON logs with a consistent schema:

{
  "timestamp": "2026-03-14T19:00:00.342Z",
  "level": "info",
  "service": "sveltekit-bff",
  "request_id": "req_a1b2c3d4",
  "user_id": "usr_x9y8z7",
  "method": "GET",
  "path": "/dashboard",
  "status": 200,
  "duration_ms": 142,
  "backend_duration_ms": 98,
  "cache_hit": true,
  "message": "Dashboard loaded"
}

The request_id is generated at the BFF layer and propagated to the Python backend via the X-Request-Id header. Both services log the same ID, enabling end-to-end tracing from browser click to database query. Log aggregation (via Loki, CloudWatch, or Datadog) can reconstruct the full request lifecycle. The IDs in this example are shortened for readability; the audit schema in Section 31.6 stores request and user IDs as UUIDs.
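
On the Python side, a formatter that emits this schema and picks up the propagated request ID can be sketched with the standard library alone. The service name "python-fastapi", the logger name, and the "fields" convention for extra values are assumptions; the real services may use a dedicated logging library instead.

```python
import contextvars
import json
import logging
from datetime import datetime, timezone

# Set once per request (for example from the X-Request-Id header) and read by
# every log call made while that request is being handled.
request_id_var = contextvars.ContextVar("request_id", default=None)

class JsonFormatter(logging.Formatter):
    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        stamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": stamp.isoformat(timespec="milliseconds")
                              .replace("+00:00", "Z"),
            "level": record.levelname.lower(),
            "service": self.service,
            "request_id": request_id_var.get(),
            "message": record.getMessage(),
        }
        # Structured extras passed as extra={"fields": {...}} are merged in.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)

logger = logging.getLogger("rapid_ai.example")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter("python-fastapi"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

request_id_var.set("req_a1b2c3d4")
logger.info("Pipeline run finished",
            extra={"fields": {"status": 200, "duration_ms": 98}})
```

A context variable keeps the ID scoped to the request being handled, so concurrent requests on the same worker do not overwrite each other's IDs.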

Health Checks and Readiness Probes

Each service exposes health endpoints for orchestrators:

Service	Endpoint	Checks
SvelteKit BFF	`GET /api/health`	Server responsive, Python backend reachable, Redis connected
Python FastAPI	`GET /rapid-ai/v1/health`	Server responsive, database connected, LLM provider reachable
PostgreSQL	TCP check on port 5432	Accepting connections

Health checks distinguish between liveness (is the process running?) and readiness (can it serve traffic?). A Python backend that has lost its database connection reports liveness: true, readiness: false — the orchestrator stops sending traffic but does not restart the container, giving the database time to recover.
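
The liveness and readiness split can be sketched as a function over dependency probes. The probe names, the rule that every dependency gates readiness, and the 200/503 mapping are assumptions for illustration; a real deployment might treat an unreachable LLM provider as degraded rather than not ready.

```python
from typing import Callable, Dict

def health_report(checks: Dict[str, Callable[[], bool]]) -> dict:
    """Run each dependency probe; a probe that raises counts as failed."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return {
        "liveness": True,                    # the process answered, so it is alive
        "readiness": all(results.values()),  # ready only if every dependency is up
        "checks": results,
    }

def http_status(report: dict) -> int:
    # 503 tells the orchestrator to stop routing traffic without restarting.
    return 200 if report["readiness"] else 503

def database_down() -> bool:
    raise ConnectionError("database unreachable")

report = health_report({"database": database_down, "llm_provider": lambda: True})
```

Here report carries liveness True and readiness False, which is the state described above for a backend that has lost its database connection.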

Performance Metrics

RAPID AI tracks P95 latency targets for every endpoint category:

Endpoint Category	P95 Target	Notes
Dashboard page load (SSR)	< 500 ms	Includes BFF transformation + Python call
Asset detail page	< 300 ms	Single asset, cached after first load
Diagnostic pipeline trigger	< 5,000 ms	Full A-through-E pipeline, compute-intensive
Copilot response (first token)	< 2,000 ms	LLM inference, streamed via WebSocket
Fleet summary API	< 1,000 ms	Aggregation of all assets, cached for 1 minute
SSE event delivery	< 200 ms	From Python event to browser render

Metrics are collected via Prometheus-compatible endpoints (/metrics on each service) and visualized in Grafana dashboards. Alerts fire when P95 latency exceeds 2x the target for 5 consecutive minutes.
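
The alert condition can be sketched over raw latency samples grouped by minute. This is an in-memory illustration only: a Prometheus setup would normally derive P95 from histogram buckets and express the rule in its own alerting configuration. The function names and the nearest-rank percentile method are choices made for the example.

```python
from typing import List, Sequence

def p95(samples: Sequence[float]) -> float:
    """95th percentile by the nearest-rank method (integer arithmetic)."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) without floats
    return ordered[rank - 1]

def should_alert(per_minute_samples: List[Sequence[float]], target_ms: float,
                 factor: float = 2.0, window: int = 5) -> bool:
    """True when P95 exceeded factor x target in each of the last `window` minutes."""
    recent = per_minute_samples[-window:]
    if len(recent) < window:
        return False  # not enough history yet
    return all(len(minute) > 0 and p95(minute) > factor * target_ms
               for minute in recent)

# Dashboard page load target from the table above: 500 ms.
slow_minutes = [[1200.0] * 20 for _ in range(5)]
firing = should_alert(slow_minutes, target_ms=500)
```

Requiring every minute in the window to breach the threshold keeps a single slow minute from paging anyone.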

Correlation and Tracing

The request lifecycle spans three services. Correlation works as follows:

Browser: fetch("/dashboard")
    |
    |  Cookie: session_id=abc123
    v
SvelteKit BFF:
    |  Generate request_id: req_a1b2c3d4
    |  Log: { request_id, user_id, path, ... }
    |  Set header: X-Request-Id: req_a1b2c3d4
    v
Python FastAPI:
    |  Read header: X-Request-Id: req_a1b2c3d4
    |  Log: { request_id, endpoint, duration, ... }
    |  Set PostgreSQL: SET app.request_id = 'req_a1b2c3d4'
    v
PostgreSQL:
    |  Audit log row: { request_id: req_a1b2c3d4, ... }

A single request_id links the browser request, the BFF log entry, the Python log entry, and the database audit record. When an engineer reports “the dashboard showed the wrong state for P-101 at 2 PM,” the support team searches for the request ID and reconstructs exactly what data was returned and why.

31.8 Summary

The BFF and security architecture is not a feature of RAPID AI — it is the reason RAPID AI can be deployed in production industrial environments where diagnostic decisions affect physical safety. The key design decisions:

In Mode A, the Python backend is never exposed to the public internet. The SvelteKit BFF is the sole entry point for browser traffic, enforcing authentication, validation, and rate limiting before any request reaches the diagnostic engine.
Authentication is server-side only. httpOnly cookies, server-side sessions, and internal JWTs. No tokens in localStorage, no credentials in the browser.
Defense in depth across four layers. SvelteKit middleware, SvelteKit server routes, FastAPI middleware, and PostgreSQL RLS. No single layer failure compromises the system.
Dual-mode deployment lets the same diagnostic engine serve browser users (Mode A, through the BFF) and enterprise integrations (Mode B, direct API access with API keys).
Every diagnostic decision is auditable. Immutable audit logs with end-to-end request IDs link browser actions to database records.
The system degrades gracefully. Cached data, stale indicators, and health check polling ensure that engineers always see something useful, even when components fail.

The next chapter addresses the operational lifecycle: how RAPID AI is built, tested, deployed, and maintained in production.

Standards Alignment

Standard	Relevance to This Chapter
OWASP Top 10 — Web application security	Section 31.6 maps every OWASP Top 10 category to a control. Examples: A01 (Broken Access Control via RBAC and RLS), A02 (Cryptographic Failures via TLS, argon2id, and RS256 JWTs), A03 (Injection via parameterized queries and schema validation), A04 (Insecure Design via the BFF pattern), A05 (Security Misconfiguration via hardened defaults set in middleware), A07 (Auth Failures via better-auth), and A09 (Logging via append-only audit trails).
IEC 62443 — Industrial cybersecurity	The zone-and-conduit architecture (browser zone, BFF zone, engine zone) implements IEC 62443’s network segmentation requirements with strict firewall rules ensuring the diagnostic engine is never exposed to the public internet.
NIST SP 800-82 — Guide to Industrial Control Systems security	The hosting options in Section 31.4 (managed cloud and self-hosted, including air-gapped installations) follow NIST SP 800-82 guidance for securing industrial systems in both enterprise IT and operational technology environments.

Changelog

Version	Date	Author	Changes
2.1.0	2026-03-17	Rick D	Added standards alignment, living doc metadata, changelog
2.0.0	2026-03-17	Rick D	Enriched with production codebase content
1.0.0	2026-03-17	Rick D	Initial chapter creation