BFF and Security Architecture

Chapter 31 — Backend-for-Frontend and Security Architecture

Chapter 14 established RAPID AI’s three-service topology: a Python FastAPI engine for diagnostic intelligence, a SvelteKit Explorer for the frontend and BFF layer, and PostgreSQL with pgvector for persistence. Chapter 20 documented the API contract between them. Chapter 30 described how the agentic copilot reasons about diagnostics through tool calls. This chapter closes the architectural story by explaining how these services are secured, how the BFF pattern protects the diagnostic engine from public exposure, and how the platform deploys in two distinct modes to serve both end users and enterprise integrations.

Security in an industrial diagnostic platform is not an afterthought. A compromised RAPID AI instance could issue false diagnostic clearances, suppress real failure warnings, or leak proprietary IMS rule libraries. The architecture described here treats every network boundary as hostile and every input as untrusted.


The Python FastAPI backend is RAPID AI’s intellectual core. It contains 451+ physics-based diagnostic rules, the SEDL entropy engine, the AESF stability framework, the FRETTLSM causal taxonomy, Weibull RUL estimation, and the CDE contradiction engine. Exposing this directly to the public internet would create an unacceptable attack surface — not because FastAPI is insecure, but because an engineering intelligence engine should not also be responsible for session management, CSRF protection, rate limiting, locale formatting, and browser-specific concerns.

The SvelteKit server sits between the browser and the Python backend as a deliberate architectural boundary:

Browser (Svelte 5 client)
    |
    |  HTTPS, session cookie, CSRF token
    |
    v
SvelteKit Server (BFF Layer)      <-- PUBLIC INTERNET BOUNDARY
    |  Bun 1.3, port 5173
    |
    |  HTTP, service JWT (RS256)
    |  Internal network only
    v
Python FastAPI Backend            <-- PRIVATE NETWORK ONLY
    |  Python 3.13, port 8000
    |
    v
PostgreSQL 17 + pgvector          <-- PRIVATE NETWORK ONLY
       port 5432

The browser sends requests to SvelteKit server routes (+page.server.ts and +server.ts files). These routes authenticate the user, validate the request, call the Python backend over the internal network, transform the response into UI-ready data, and return it to the browser. The Python backend binds to 127.0.0.1 or a Docker internal network. In the standard browser-facing deployment it is unreachable from outside the server boundary; the API-only mode described later in this chapter is the one deliberate exception.

The BFF is not a thin reverse proxy. It is a purposeful transformation and security layer with six responsibilities:

  1. Authentication and session management. The better-auth library runs on the SvelteKit server. It issues httpOnly cookies, manages OAuth flows, and stores sessions server-side. The Python backend never sees raw user credentials.

  2. Authorization enforcement. Every server route checks the user’s role before calling the Python backend. An operator cannot access the rule management API. A manager cannot trigger a raw diagnostic pipeline run.

  3. Request validation. Zod schemas validate every incoming request on the SvelteKit server before it reaches Python. Malformed requests are rejected at the boundary, not forwarded to the engine.

  4. Data transformation. The Python backend returns engineering-correct output: raw confidence scores, AESF state codes, Weibull parameters, entropy vectors. The BFF transforms these into UI-ready data with labels, colors, icons, and role-appropriate detail levels.

  5. API aggregation. A fleet dashboard requires data from every asset. The BFF calls multiple Python endpoints in parallel over the internal network and returns a single merged response to the browser, eliminating dozens of round trips.

  6. Rate limiting and abuse protection. Per-user and per-endpoint rate limits are enforced at the SvelteKit layer. The Python backend is shielded from traffic spikes.
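The aggregation responsibility can be sketched as a fan-out and merge. The BFF itself is TypeScript, where `Promise.all` would play the same role; Python is used here only for illustration. `fetch_asset_summary` is a hypothetical stand-in for an internal HTTP call, and the returned fields are made up.

```python
import asyncio


async def fetch_asset_summary(asset_id: str) -> dict:
    """Hypothetical stand-in for one internal call to the Python backend."""
    await asyncio.sleep(0.01)  # simulate internal network latency
    return {"asset_id": asset_id, "state": "stable"}  # made-up payload


async def fleet_summary(asset_ids: list[str]) -> dict:
    """Fetch every asset concurrently and merge the results into one response."""
    results = await asyncio.gather(
        *(fetch_asset_summary(asset_id) for asset_id in asset_ids),
        return_exceptions=True,  # one failing asset should not sink the dashboard
    )
    assets, failed = [], []
    for asset_id, result in zip(asset_ids, results):
        if isinstance(result, BaseException):
            failed.append(asset_id)
        else:
            assets.append(result)
    return {"assets": assets, "failed": failed}
```

Because the calls run concurrently, total latency is close to that of the slowest single call rather than the sum of all calls, and the browser still makes one request.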

The BFF creates two independent API contracts:

  • Internal contract (SvelteKit to Python): Stable, versioned at /rapid-ai/v1/, engineering-focused. Changes when the diagnostic engine evolves.
  • Frontend contract (Browser to SvelteKit): UI-focused, shaped for specific views. Changes when the interface evolves.

These contracts change for different reasons at different times. The BFF absorbs the mismatch, allowing the diagnostic engine and the user interface to evolve independently.


RAPID AI uses better-auth for authentication, running entirely on the SvelteKit server. The library provides:

  • Session management. Sessions are stored server-side in PostgreSQL (via Drizzle ORM). The browser receives an httpOnly, Secure, SameSite=Lax cookie containing only the session ID. No tokens, no user data, no role information is stored client-side.
  • OAuth providers. Google, Microsoft, and GitHub OAuth for enterprise SSO. The OAuth callback is handled server-side; the browser never sees the OAuth token exchange.
  • Email and password. For organizations that do not use SSO. Passwords are hashed with argon2id. Password reset flows use time-limited, single-use tokens.
  • Session rotation. Sessions are rotated on privilege escalation (role change, password change) to prevent session fixation attacks.

RAPID AI defines four roles with increasing privilege:

| Role | Description | Access Scope |
|---|---|---|
| Operator | Plant floor personnel | View dashboards, assigned assets only, simplified diagnostics |
| Reliability Engineer | Domain expert | Full diagnostic detail, RCM workbooks, copilot, rule citations |
| Manager | Maintenance/plant manager | Fleet summaries, cost impact, maintenance scheduling, reporting |
| Administrator | System admin | User management, asset configuration, rule deployment, audit logs |

Role enforcement happens at two levels:

  1. SvelteKit server routes check the user’s role from the session before calling the Python backend. A route for /admin/users rejects non-administrators with a 403 before any backend call is made.
  2. Python FastAPI middleware validates the role claim in the service JWT. Even if an attacker bypasses the BFF (which requires compromising the internal network), the Python backend independently verifies authorization.

Communication between SvelteKit and Python uses short-lived JWTs:

  • Algorithm: RS256 (asymmetric). The SvelteKit server holds the private key; the Python backend validates with the public key.
  • Expiry: 5 minutes. Tokens are generated per-request, not cached.
  • Claims: iss (SvelteKit service identifier), sub (authenticated user ID), roles (array of user roles), tenant_id (for multi-tenant deployments), exp (expiration timestamp).
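  • No refresh tokens. Service JWTs are ephemeral. If the SvelteKit server is compromised, the attacker can generate valid JWTs only while the backend still trusts the public key that matches the stolen private key. Rotating the key pair and removing the old public key from the backend causes every token signed with the old key to be rejected.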

JWTs are used strictly for internal service-to-service communication. They are never sent to the browser and never stored in cookies or localStorage.
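The claim set and the five-minute expiry can be illustrated without any cryptography. The sketch below covers only claim construction and checking; RS256 signing and signature verification are out of scope and would be delegated to a JWT library. The issuer string `sveltekit-bff` and both function names are assumptions for illustration.

```python
import time

TOKEN_TTL_SECONDS = 5 * 60  # 5-minute expiry, per the list above
REQUIRED_CLAIMS = ("iss", "sub", "roles", "tenant_id", "exp")
ISSUER = "sveltekit-bff"  # assumed service identifier


def build_claims(user_id, roles, tenant_id, now=None):
    """Build a fresh claim set for a single request (never cached)."""
    now = time.time() if now is None else now
    return {
        "iss": ISSUER,
        "sub": user_id,
        "roles": list(roles),
        "tenant_id": tenant_id,
        "exp": int(now) + TOKEN_TTL_SECONDS,
    }


def validate_claims(claims, now=None):
    """Return a list of problems; an empty list means the claims are acceptable.

    Assumes the token signature has already been verified elsewhere.
    """
    now = time.time() if now is None else now
    problems = [f"missing claim: {name}" for name in REQUIRED_CLAIMS if name not in claims]
    if problems:
        return problems
    if claims["iss"] != ISSUER:
        problems.append("unexpected issuer")
    if claims["exp"] <= now:
        problems.append("token expired")
    if not isinstance(claims["roles"], list) or not claims["roles"]:
        problems.append("roles must be a non-empty list")
    return problems
```

In this sketch, any non-empty result would map to the 401 response described for invalid or expired tokens.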


Security is implemented as defense in depth — four concentric layers, each independently capable of rejecting malicious requests.

The outermost layer runs on every incoming request before any route handler executes:

  • Authentication check. Unauthenticated requests to protected routes receive a 302 redirect to the login page (for page requests) or a 401 JSON response (for API requests).
  • CSRF protection. SvelteKit’s built-in CSRF protection validates the Origin header on state-changing requests. Custom CSRF tokens are used for forms.
  • Rate limiting. Token bucket algorithm per authenticated user. Default limits: 100 requests/minute for read endpoints, 20 requests/minute for write endpoints, 5 requests/minute for diagnostic pipeline triggers.
  • Request size limits. 1 MB maximum body size. Requests exceeding this limit are rejected before parsing.
  • Security headers. Content-Security-Policy, X-Content-Type-Options: nosniff, X-Frame-Options: DENY, Strict-Transport-Security with a 1-year max-age.
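A token bucket with the default limits above might look like the following minimal sketch. It is written in Python for illustration only (the BFF is TypeScript), the category names `read`, `write`, and `pipeline` are assumed labels, and a production limiter would also need shared storage across server instances.

```python
import time

# Default per-user limits from the list above (requests per minute)
LIMITS_PER_MINUTE = {"read": 100, "write": 20, "pipeline": 5}


class TokenBucket:
    """Bucket that holds up to one minute of budget and refills continuously."""

    def __init__(self, rate_per_minute, now=None):
        self.capacity = float(rate_per_minute)
        self.refill_per_second = rate_per_minute / 60.0
        self.tokens = self.capacity
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Consume one token if available and report whether the request may proceed."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


_buckets = {}  # (user_id, category) -> TokenBucket


def allow_request(user_id, category, now=None):
    """Check the per-user bucket for an endpoint category, creating it on first use."""
    key = (user_id, category)
    if key not in _buckets:
        _buckets[key] = TokenBucket(LIMITS_PER_MINUTE[category], now=now)
    return _buckets[key].allow(now=now)
```

A rejected call would typically translate to an HTTP 429 response. The `now` parameter exists so the behavior can be exercised deterministically in tests.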

Each +page.server.ts and +server.ts file implements route-specific protection:

  • Input validation. Zod schemas parse and validate every field. Asset IDs are validated as UUIDs. Date ranges are checked for sanity. Sensor tag names are validated against a whitelist.
  • Role authorization. The route checks whether the user’s role permits the requested operation.
  • Data transformation. Responses are filtered by role before returning to the browser. An operator never receives raw confidence scores or rule IDs, even if the Python backend includes them.
  • Audit logging. Every request is logged with: request ID (UUID), user ID, user role, endpoint, HTTP method, response status, and processing time. For write operations, the request payload is logged (with sensitive fields redacted).
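Role-based response filtering is easiest to reason about as an allowlist. The sketch below is in Python for illustration (the actual routes are TypeScript); the field names, role keys, and sample values are all hypothetical.

```python
# Hypothetical per-role allowlists of response fields
VISIBLE_FIELDS = {
    "operator": {"asset_id", "status_label", "recommended_action"},
    "reliability_engineer": {
        "asset_id", "status_label", "recommended_action",
        "confidence", "rule_ids", "weibull",
    },
}


def shape_for_role(engine_output: dict, role: str) -> dict:
    """Keep only the fields the role is allowed to see.

    Unknown roles receive nothing, so a missing mapping fails closed.
    """
    allowed = VISIBLE_FIELDS.get(role, set())
    return {key: value for key, value in engine_output.items() if key in allowed}


# Made-up engine output for demonstration
engine_output = {
    "asset_id": "P-101",
    "status_label": "Degrading",
    "recommended_action": "Inspect bearing",
    "confidence": 0.87,
    "rule_ids": ["rule-a", "rule-b"],
    "weibull": {"beta": 2.1, "eta": 4200},
}
operator_view = shape_for_role(engine_output, "operator")
```

An allowlist means a new field added by the engine stays hidden from operators until someone explicitly exposes it, which matches the guarantee that operators never receive raw confidence scores or rule IDs.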

The Python backend validates every incoming request independently:

  • Service JWT verification. The Authorization: Bearer <jwt> header is validated on every request. Invalid, expired, or missing tokens return 401.
  • API key verification. For Mode B (API-only) deployments, requests are authenticated via X-API-Key header. Keys are scoped to specific operations and rate-limited independently.
  • Pydantic model validation. Every request body is parsed through a Pydantic model. Invalid payloads return 422 with detailed validation errors (in development) or generic errors (in production, to avoid information leakage).
  • Request ID propagation. The X-Request-Id header from the BFF is propagated through all internal calls and into database audit records, enabling end-to-end tracing.

Layer 4: Database-Level Row-Level Security

For multi-tenant deployments, PostgreSQL row-level security (RLS) provides the final protection layer:

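-- Every tenant-scoped table has a tenant_id column
ALTER TABLE assets ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON assets
    USING (tenant_id = current_setting('app.current_tenant_id')::uuid);

-- Set at the start of each transaction, before any queries.
-- SET LOCAL reverts at commit or rollback, so the value cannot leak
-- to the next request that reuses a pooled connection.
SET LOCAL app.current_tenant_id = '<tenant-uuid>';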

RLS ensures that even if application-level authorization has a bug, one tenant’s data is invisible to another. The tenant_id is set from the JWT’s tenant_id claim at the beginning of each database transaction. Application code cannot override it.
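One way to set the tenant from the JWT claim is PostgreSQL's `set_config` function, whose third argument scopes the value to the current transaction. Unlike a plain `SET` statement, it accepts a bind parameter. The sketch below only builds the statement; the function name and the `%(name)s` placeholder style (as used by drivers such as psycopg) are assumptions.

```python
import uuid


def tenant_context_statement(tenant_id_claim: str):
    """Build the per-transaction statement that scopes RLS to one tenant.

    Raises ValueError if the claim is not a well-formed UUID, so a malformed
    tenant_id never reaches the database.
    """
    tenant_id = str(uuid.UUID(tenant_id_claim))  # validate and normalize
    # Third argument `true` makes the setting local to the current transaction
    sql = "SELECT set_config('app.current_tenant_id', %(tenant_id)s, true)"
    return sql, {"tenant_id": tenant_id}
```

The returned pair would be executed as the first statement inside each transaction, before any tenant-scoped query runs.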


                    INTERNET
                       |
                       v
              +------------------+
              |  Load Balancer   |
              | (TLS termination)|
              +--------+---------+
                       |
              PUBLIC NETWORK (DMZ)
                       |
              +--------+---------+
              |  SvelteKit BFF   |
              |  Bun 1.3         |
              |  port 5173       |
              +--------+---------+
                       |
      ---- PRIVATE NETWORK BOUNDARY ----
                       |
         +-------------+-------------+
         |                           |
+--------+---------+    +------------+----------+
| Python FastAPI   |    | PostgreSQL 17         |
| Python 3.13      |    | + pgvector            |
| port 8000        |    | port 5432             |
| (internal only)  |    | (internal only)       |
+------------------+    +--+-----------+--------+
                           |           |
                       +---+----+   +--+---------+
                       | Data   |   | Vector     |
                       | Tables |   | Embeddings |
                       +--------+   +------------+

SvelteKit BFF is edge-deployable, adapting to multiple hosting environments:

| Platform | Adapter | Notes |
|---|---|---|
| Cloudflare Workers | @sveltejs/adapter-cloudflare | Edge deployment, sub-50ms TTFB globally |
| Vercel | @sveltejs/adapter-vercel | Serverless functions, automatic scaling |
| Node/Bun | @sveltejs/adapter-node | Traditional server, full WebSocket support |
| Docker | @sveltejs/adapter-node + Dockerfile | Self-hosted, full control |

Python FastAPI runs as a containerized service on the private network:

  • Docker image: Python 3.13 slim base, multi-stage build. Final image under 300 MB.
  • Process manager: uvicorn with 4 workers (auto-scaled to CPU count in production).
  • Health endpoint: GET /rapid-ai/v1/health returns service status, database connectivity, and LLM provider reachability.
  • Never exposed to public internet. No published ports. Accessible only from the SvelteKit container on the shared Docker network, or via ClusterIP in Kubernetes.

PostgreSQL 17 + pgvector runs as a managed instance or self-hosted:

| Option | Use Case |
|---|---|
| Supabase | Managed Postgres with built-in pgvector, auth helpers, real-time subscriptions |
| Neon | Serverless Postgres with branching, scales to zero for development |
| Self-hosted (Docker) | Full control, air-gapped deployments for sensitive industrial environments |

RAPID AI supports two deployment modes to serve different customer segments.

Mode A is the standard deployment for end users who interact through the browser.

Browser --> SvelteKit BFF --> Python FastAPI --> PostgreSQL
                  |
            better-auth sessions
            SSR dashboards
            SSE real-time updates
            Copilot WebSocket

All traffic flows through the BFF. Users authenticate with sessions. The Python backend is invisible to the outside world. This is the deployment described throughout this book.

Mode B (API-only) serves enterprise customers who integrate RAPID AI’s diagnostic intelligence into their existing systems (CMMS, SCADA historians, custom dashboards).

Enterprise System --> Python FastAPI --> PostgreSQL
                            |
                      API key auth (X-API-Key)
                      Rate limiting per key
                      OpenAPI documentation

In Mode B, the Python FastAPI backend is exposed directly (with API key authentication) and the SvelteKit layer is not deployed. This mode is used when:

  • The customer has their own frontend and only needs the diagnostic engine.
  • Integration is machine-to-machine (CMMS pulling maintenance recommendations, historian pushing sensor data).
  • The customer’s security policy requires their own authentication layer in front of all services.

Both modes use the same:

  • Diagnostic pipeline (Modules A through E)
  • Rule evaluation engine (451+ rules)
  • SEDL entropy computation
  • AESF stability framework
  • FRETTLSM causal taxonomy
  • CDE contradiction detection
  • PostgreSQL schema and data
  • Pydantic validation models

The only difference is the security and presentation layer. Mode A adds the BFF’s transformation, caching, SSR, and session management. Mode B exposes the raw engineering API with API key protection.

The deployment mode is controlled by environment variables and Docker Compose profiles:

# docker-compose.yml
services:
  sveltekit:
    profiles: ["full"]   # Only in Mode A
    build: ./frontend
    ports:
      - "5173:5173"
    networks:
      - public
      - internal
  fastapi:
    build: ./backend
    ports:
      # Empty in Mode A (internal only). Some Compose versions may reject
      # an empty port entry, so verify this against the version in use.
      - "${API_PORT:-}"
    networks:
      - internal
    environment:
      - DEPLOYMENT_MODE=${DEPLOYMENT_MODE:-full}
      - API_KEY_AUTH_ENABLED=${API_KEY_AUTH:-false}
  postgres:
    image: pgvector/pgvector:pg17
    networks:
      - internal

# Networks referenced by the services must be declared at the top level
networks:
  public:
  internal:

Mode A: docker compose --profile full up — SvelteKit is deployed, FastAPI has no published ports.

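Mode B: API_PORT=8000:8000 API_KEY_AUTH=true docker compose up fastapi postgres. With those variables set, the FastAPI port is published and API key auth is enabled. SvelteKit is not deployed because its profile is not activated.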


Each OWASP Top 10 risk is addressed with specific controls relevant to RAPID AI’s architecture:

| OWASP Risk | Mitigation |
|---|---|
| A01: Broken Access Control | Role-based access at both SvelteKit and FastAPI layers. RLS at database level. Object references are not enumerable: all IDs are UUIDs, never sequential integers. |
| A02: Cryptographic Failures | TLS everywhere (even internal in enterprise deployments). Passwords hashed with argon2id. Service JWTs signed with RS256. No secrets in code: they are supplied at runtime through environment variables or mounted secret files. |
| A03: Injection | Drizzle ORM (TypeScript) and SQLAlchemy Core (Python) use parameterized queries exclusively. No raw SQL string concatenation anywhere in the codebase. Zod and Pydantic validate all inputs before they reach query builders. |
| A04: Insecure Design | BFF pattern isolates the diagnostic engine. Defense in depth with four validation layers. Threat modeling performed per deployment topology. |
| A05: Security Misconfiguration | Production Docker images run as non-root. Default credentials are rejected at startup. Security headers are set in middleware, not left to deployment configuration. |
| A06: Vulnerable Components | npm audit and pip-audit run in CI. Dependabot monitors for CVEs. Base Docker images are pinned to specific digests, not floating tags. |
| A07: Auth Failures | better-auth handles session management with secure defaults. Session rotation on privilege changes. Account lockout after 5 failed attempts with exponential backoff. |
| A08: Data Integrity Failures | Docker images are signed. CI/CD pipeline requires passing tests before deployment. Database migrations are version-controlled and reviewed. |
| A09: Logging Failures | Structured JSON logging at every layer. Request IDs enable end-to-end tracing. Audit logs for all diagnostic decisions are immutable (append-only table with no UPDATE or DELETE permission). |
| A10: SSRF | The SvelteKit BFF only calls the Python backend at a hardcoded internal URL. No user-supplied URLs are fetched server-side. The Python backend has no outbound network access except to configured LLM providers and the database. |

Validation is never skipped, even between trusted services:

Browser Input
    |
    v
Zod Schema (SvelteKit)            # TypeScript validation
    |  Rejects: wrong types, missing fields, invalid ranges
    v
Service JWT (contains validated user context)
    |
    v
Pydantic Model (FastAPI)          # Python validation
    |  Rejects: wrong types, constraint violations
    v
SQLAlchemy Parameterized Query    # Database validation
    |  Rejects: type mismatches, constraint violations
    v
PostgreSQL CHECK Constraints      # Final enforcement
    |  Rejects: domain violations
    v
RLS Policy                        # Tenant isolation

If an attacker bypasses Zod validation (by crafting raw HTTP requests), Pydantic catches it. If Pydantic has a bug, PostgreSQL constraints catch it. If the application has a tenant isolation bug, RLS catches it. No single layer failure compromises the system.

Both data layers bind values as query parameters by default, and raw SQL string interpolation is prohibited by policy:

// Drizzle ORM (SvelteKit) -- parameterized by design
import { eq } from "drizzle-orm";

const assets = await db
  .select()
  .from(assetsTable)
  .where(eq(assetsTable.tenantId, tenantId));

# SQLAlchemy Core (FastAPI) -- parameterized by design
from sqlalchemy import select

stmt = select(assets_table).where(
    assets_table.c.tenant_id == tenant_id
)
result = await conn.execute(stmt)

Both ORMs generate parameterized SQL. The tenant_id value is passed as a bind parameter, never interpolated into the query string. A CI linter flags any use of raw SQL strings or f-string query construction.

RAPID AI follows a strict no-secrets-in-code policy:

  • Environment variables for all secrets outside production: database URLs, API keys, JWT private keys, LLM provider keys, OAuth client secrets.
  • .env files are listed in .gitignore and .dockerignore. A .env.example file documents required variables without values.
  • Docker secrets for production deployments. Secrets are mounted as files in /run/secrets/, not passed as environment variables (which can leak via /proc).
  • Key rotation is supported without downtime. The Python backend accepts multiple JWT public keys during rotation windows. Old keys are removed after all in-flight tokens expire.
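The rotation window can be modeled as a keyring in which a retiring key stays trusted for exactly one token lifetime. This is a sketch under assumptions: the key IDs, the class name, and the idea of identifying keys by ID (for example through the JWT `kid` header) are illustrative, and signature verification itself is omitted.

```python
TOKEN_TTL_SECONDS = 5 * 60  # service JWT lifetime stated earlier in this chapter


class VerificationKeyring:
    """Tracks which public keys the backend currently accepts."""

    def __init__(self):
        self._keys = {}  # key_id -> (public_key_pem, retire_at or None)

    def add(self, key_id, public_key_pem):
        """Start trusting a new public key immediately."""
        self._keys[key_id] = (public_key_pem, None)

    def begin_retirement(self, key_id, now):
        """Keep the old key only until tokens signed just before `now` have expired."""
        pem, _ = self._keys[key_id]
        self._keys[key_id] = (pem, now + TOKEN_TTL_SECONDS)

    def accepted_key_ids(self, now):
        """Drop keys whose retirement deadline has passed and list the rest."""
        self._keys = {
            key_id: entry
            for key_id, entry in self._keys.items()
            if entry[1] is None or now < entry[1]
        }
        return sorted(self._keys)
```

Because a token signed at the moment of rotation expires five minutes later, retiring the old key after one token lifetime drops no valid in-flight request.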

Every diagnostic decision is recorded in an append-only audit table:

CREATE TABLE diagnostic_audit_log (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    request_id      UUID NOT NULL,
    user_id         UUID NOT NULL,
    asset_id        UUID NOT NULL,
    tenant_id       UUID NOT NULL,
    action          TEXT NOT NULL,   -- 'pipeline_run', 'copilot_query', 'rcm_decision'
    input_hash      TEXT NOT NULL,   -- SHA-256 of request payload
    output_hash     TEXT NOT NULL,   -- SHA-256 of response payload
    result_summary  JSONB NOT NULL,  -- Key findings (state, top faults, confidence)
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- No UPDATE or DELETE permissions granted to the application role
REVOKE UPDATE, DELETE ON diagnostic_audit_log FROM app_user;

This table answers the question every plant manager asks after an incident: “What did the system recommend, when, and based on what data?” The input_hash and output_hash let an auditor verify that full payloads stored separately are the same request and response the system actually processed.
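Computing and checking such hashes could look like the sketch below. The source does not specify how payloads are serialized before hashing, so the canonical JSON form used here (sorted keys, compact separators, UTF-8) is an assumption, as are the function names.

```python
import hashlib
import hmac
import json


def payload_hash(payload: dict) -> str:
    """SHA-256 hex digest of a canonical JSON encoding of the payload."""
    canonical = json.dumps(
        payload,
        sort_keys=True,          # key order must not change the hash
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches_audit_record(stored_payload: dict, logged_hash: str) -> bool:
    """Check a separately stored payload against the hash in the audit row."""
    return hmac.compare_digest(payload_hash(stored_payload), logged_hash)
```

Canonicalizing before hashing matters because two services may serialize the same payload with different key order or whitespace, which would otherwise produce different digests.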


All services emit structured JSON logs with a consistent schema:

{
  "timestamp": "2026-03-14T19:00:00.342Z",
  "level": "info",
  "service": "sveltekit-bff",
  "request_id": "req_a1b2c3d4",
  "user_id": "usr_x9y8z7",
  "method": "GET",
  "path": "/dashboard",
  "status": 200,
  "duration_ms": 142,
  "backend_duration_ms": 98,
  "cache_hit": true,
  "message": "Dashboard loaded"
}

The request_id is generated at the BFF layer and propagated to the Python backend via the X-Request-Id header. Both services log the same ID, enabling end-to-end tracing from browser click to database query. Log aggregation (via Loki, CloudWatch, or Datadog) can reconstruct the full request lifecycle.
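A minimal sketch of the shared log schema and the ID handoff follows. The helper names and the service name `fastapi-engine` are hypothetical; the UUID format follows the earlier statement that request IDs are UUIDs.

```python
import json
import uuid
from datetime import datetime, timezone


def new_request_id() -> str:
    """Generate the ID once, at the edge of the system."""
    return str(uuid.uuid4())


def backend_headers(request_id: str) -> dict:
    """Headers attached to the internal call so the backend logs the same ID."""
    return {"X-Request-Id": request_id}


def log_line(service, level, message, request_id, **fields):
    """Render one structured log record as a JSON string."""
    now = datetime.now(timezone.utc)
    record = {
        "timestamp": now.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
        "level": level,
        "service": service,
        "request_id": request_id,
        **fields,  # e.g. method, path, status, duration_ms
        "message": message,
    }
    return json.dumps(record)
```

Each service calls `log_line` with the ID it generated or received, so a log aggregator can join the entries on `request_id` alone.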

Each service exposes health endpoints for orchestrators:

| Service | Endpoint | Checks |
|---|---|---|
| SvelteKit BFF | GET /api/health | Server responsive, Python backend reachable, Redis connected |
| Python FastAPI | GET /rapid-ai/v1/health | Server responsive, database connected, LLM provider reachable |
| PostgreSQL | TCP check on port 5432 | Accepting connections |

Health checks distinguish between liveness (is the process running?) and readiness (can it serve traffic?). A Python backend that has lost its database connection reports liveness: true, readiness: false — the orchestrator stops sending traffic but does not restart the container, giving the database time to recover.
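The liveness and readiness split can be sketched as a payload builder plus a status-code mapping. Treating only the database as a readiness gate, while reporting LLM reachability for information, is an assumption here; the function names are hypothetical.

```python
def health_report(database_ok: bool, llm_ok: bool) -> dict:
    """Build a health payload that separates liveness from readiness."""
    return {
        "liveness": True,          # if this code runs, the process is alive
        "readiness": database_ok,  # assumption: only the database gates traffic
        "checks": {"database": database_ok, "llm_provider": llm_ok},
    }


def readiness_status_code(report: dict) -> int:
    """HTTP status for a readiness probe: 200 when ready, 503 otherwise."""
    return 200 if report["readiness"] else 503


def liveness_status_code(report: dict) -> int:
    """HTTP status for a liveness probe: failing it asks for a restart."""
    return 200 if report["liveness"] else 503
```

With a lost database connection, the readiness probe fails while the liveness probe still passes, so an orchestrator stops routing traffic without restarting the container.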

RAPID AI tracks P95 latency targets for every endpoint category:

| Endpoint Category | P95 Target | Notes |
|---|---|---|
| Dashboard page load (SSR) | < 500 ms | Includes BFF transformation + Python call |
| Asset detail page | < 300 ms | Single asset, cached after first load |
| Diagnostic pipeline trigger | < 5,000 ms | Full A-through-E pipeline, compute-intensive |
| Copilot response (first token) | < 2,000 ms | LLM inference, streamed via WebSocket |
| Fleet summary API | < 1,000 ms | Aggregation of all assets, cached for 1 minute |
| SSE event delivery | < 200 ms | From Python event to browser render |

Metrics are collected via Prometheus-compatible endpoints (/metrics on each service) and visualized in Grafana dashboards. Alerts fire when P95 latency exceeds 2x the target for 5 consecutive minutes.
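The alert condition can be made concrete with a small sketch. In practice the rule would live in the monitoring stack rather than in application code; this version uses the nearest-rank definition of P95 and assumes latency samples arrive already grouped by minute.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of latencies (ms)."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


def should_alert(per_minute_samples, target_ms, factor=2, window=5):
    """True when P95 exceeded factor * target in each of the last `window` minutes.

    `per_minute_samples` is a list of per-minute latency lists, oldest first.
    A minute with no samples breaks the streak.
    """
    if len(per_minute_samples) < window:
        return False
    for minute in per_minute_samples[-window:]:
        if not minute or p95(minute) <= factor * target_ms:
            return False
    return True
```

For the dashboard target of 500 ms, the alert fires only after five consecutive minutes with P95 above 1,000 ms, so a single slow minute does not page anyone.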

The request lifecycle spans three services. Correlation works as follows:

Browser: fetch("/dashboard")
    |
    |  Cookie: session_id=abc123
    v
SvelteKit BFF:
    |  Generate request_id: req_a1b2c3d4
    |  Log: { request_id, user_id, path, ... }
    |  Set header: X-Request-Id: req_a1b2c3d4
    v
Python FastAPI:
    |  Read header: X-Request-Id: req_a1b2c3d4
    |  Log: { request_id, endpoint, duration, ... }
    |  Set PostgreSQL: SET app.request_id = 'req_a1b2c3d4'
    v
PostgreSQL:
    |  Audit log row: { request_id: req_a1b2c3d4, ... }

A single request_id links the browser request, the BFF log entry, the Python log entry, and the database audit record. When an engineer reports “the dashboard showed the wrong state for P-101 at 2 PM,” the support team searches for the request ID and reconstructs exactly what data was returned and why.


The BFF and security architecture is not a feature of RAPID AI — it is the reason RAPID AI can be deployed in production industrial environments where diagnostic decisions affect physical safety. The key design decisions:

  1. The Python backend is never exposed to the public internet in the browser-facing deployment. The SvelteKit BFF is the sole entry point for browser traffic, enforcing authentication, validation, and rate limiting before any request reaches the diagnostic engine.

  2. Authentication is server-side only. httpOnly cookies, server-side sessions, and internal JWTs. No tokens in localStorage, no credentials in the browser.

  3. Defense in depth across four layers. SvelteKit middleware, SvelteKit server routes, FastAPI middleware, and PostgreSQL RLS. No single layer failure compromises the system.

  4. Dual-mode deployment lets the same diagnostic engine serve browser users (Mode A, through the BFF) and enterprise integrations (Mode B, direct API access with API keys).

  5. Every diagnostic decision is auditable. Immutable audit logs with end-to-end request IDs link browser actions to database records.

  6. The system degrades gracefully. Cached data, stale indicators, and health check polling ensure that engineers always see something useful, even when components fail.

The next chapter addresses the operational lifecycle: how RAPID AI is built, tested, deployed, and maintained in production.


| Standard | Relevance to This Chapter |
|---|---|
| OWASP Top 10: Web application security | The BFF security architecture addresses the OWASP Top 10 categories, including A01 (Broken Access Control via RBAC), A02 (Cryptographic Failures via TLS/JWT), A03 (Injection via parameterized queries and input validation), A04 (Insecure Design via BFF pattern), A05 (Security Misconfiguration via environment-based config), A07 (Auth Failures via better-auth), and A09 (Logging via immutable audit trails). |
| IEC 62443: Industrial cybersecurity | The zone-and-conduit architecture (browser zone, BFF zone, engine zone) implements IEC 62443’s network segmentation requirements with strict firewall rules ensuring the diagnostic engine is never exposed to the public internet. |
| NIST SP 800-82: Guide to Industrial Control Systems security | The supported hosting options (managed cloud and self-hosted, including air-gapped installations) follow NIST SP 800-82 guidance for securing industrial systems in both enterprise IT and operational technology environments. |

| Version | Date | Author | Changes |
|---|---|---|---|
| 2.1.0 | 2026-03-17 | Rick D | Added standards alignment, living doc metadata, changelog |
| 2.0.0 | 2026-03-17 | Rick D | Enriched with production codebase content |
| 1.0.0 | 2026-03-17 | Rick D | Initial chapter creation |