System Architecture

RAPID AI is built as a monolith-first, API-first engineering intelligence platform. The decision to avoid a microservices architecture at this stage is deliberate: microservices impose network boundaries, deployment orchestration, and distributed debugging overhead that a young product cannot afford. The domain logic of industrial diagnostics is complex enough without scattering it across a dozen independently failing processes.

Instead, RAPID AI adopts a modular monolith. The codebase is organized into concentric rings with strict dependency direction — inner rings know nothing about outer rings — so that services can be extracted later if scale demands it, but today they communicate through function calls, not network hops.

The platform consists of three runtime services:

FastAPI Engine (Python 3.13, port 8000). This is the physics pipeline, the diagnostic intelligence, the swarm coordination, the knowledge search, and every computation that touches RAPID AI’s intellectual property. It runs on uvicorn, uses Pydantic for schema validation, and connects to PostgreSQL via SQLAlchemy Core for read/write operations. The engine owns Ring 0 (pure physics — no I/O) and Ring 1 (orchestration), with Ring 2 (infrastructure) handling all external boundaries: database, AI providers, and agent tools.

SvelteKit Explorer (Bun 1.3, port 5173). This is the consolidated frontend application — previously five separate apps (Explorer, Invent, Monitor, Suite, Workshop), now unified into a single SvelteKit application with route groups for isolation. It serves as both the user-facing dashboard and the backend-for-frontend (BFF) layer that proxies API calls, manages authentication via better-auth, and owns the PostgreSQL schema through Drizzle ORM.

PostgreSQL 17 with pgvector (port 5432). A single database instance serves as the single source of truth (SSOT) for asset hierarchy, analysis results, vector embeddings, sensor readings, work orders, failure history, alerts, and audit logs. pgvector enables semantic similarity search over 768-dimensional embeddings without requiring a separate vector database.

Every capability in the system is exposed as a REST endpoint. The API prefix is /rapid-ai/v1/. This API-first discipline ensures that the SvelteKit frontend, future mobile clients, CMMS integrations, and third-party consumers all access the same capabilities through the same contracts.

Schema ownership follows a strict rule: Drizzle owns the DDL. The SvelteKit application manages all table creation, migrations, and schema changes through drizzle-kit. Python mirrors the schema through SQLAlchemy Table objects that are used for type-safe queries but never create or alter tables. This prevents two ORMs from fighting over the same database.


+------------------------------------------------------------------+
|                           CLIENT LAYER                           |
|                                                                  |
|   +------------------+   +------------------+     +----------+   |
|   | SvelteKit        |   | API Consumers    |     | CMMS     |   |
|   | Explorer         |   | (REST clients,   |     | Export   |   |
|   | Dashboard        |   | mobile, CLI)     |     | Targets  |   |
|   +--------+---------+   +--------+---------+     +----+-----+   |
|            |                      |                    |         |
+------------|----------------------|--------------------|---------+
             |                      |                    |
             v                      v                    v
+------------------------------------------------------------------+
|                   API GATEWAY LAYER (FastAPI)                    |
|                      Prefix: /rapid-ai/v1/                       |
|                                                                  |
|  Auth (better-auth) | Rate Limiting | API Versioning             |
|  Feature Flags      | CORS          | Request Validation         |
|                                                                  |
|   +-----------+  +-----------+  +-----------+  +-------------+   |
|   | /evaluate |  | /diagnose |  | /assets/* |  | /swarm/*    |   |
|   | /moduleA  |  | /copilot  |  | /health   |  | /knowledge  |   |
|   | /moduleB  |  |           |  |           |  | /stream     |   |
|   | /moduleC  |  |           |  |           |  |             |   |
|   | /moduleD  |  |           |  |           |  |             |   |
|   | /moduleE  |  |           |  |           |  |             |   |
|   +-----------+  +-----------+  +-----------+  +-------------+   |
+------------------------------------------------------------------+
                                 |
                                 v
+------------------------------------------------------------------+
|                          SERVICE LAYER                           |
|                                                                  |
| +--------------------------------------------------------------+ |
| |                  AnalysisService (Pipeline)                  | |
| |                                                              | |
| |  mA: analyze_signal()                                        | |
| |    Guard rules (DG001-DG019) --> Signal features             | |
| |    (RMS, peak, crest, kurtosis) --> ISO zone classification  | |
| |        |                                                     | |
| |        v                                                     | |
| |  mB: detect_faults()                                         | |
| |    Component rules (121 rules, 12 types)                     | |
| |    Signal feature rules (SF001-SF051)                        | |
| |    Trend analysis (Step/Chaotic/Accel/Drift/Stable)          | |
| |    SEDL entropy (SE + TE + DE)                               | |
| |        |                                                     | |
| |        v                                                     | |
| |  mC: fuse_ssi()                                              | |
| |    BSR001-BSR022 block scoring (3-pass)                      | |
| |    SSI weighted aggregation + system profile weights (YAML)  | |
| |        |                                                     | |
| |        v                                                     | |
| |  mD: predict_prognostics()                                   | |
| |    HSR001-HSR010 health stage rules                          | |
| |    RUL estimation (Weibull-adjusted log-slope)               | |
| |        |                                                     | |
| |        v                                                     | |
| |  mE: plan_maintenance()                                      | |
| |    PWR001-PWR010 priority windows (2-pass)                   | |
| |    Priority = 100 x (0.45S + 0.25C + 0.20K + 0.10U)          | |
| |    Action ranking with boost                                 | |
| +--------------------------------------------------------------+ |
|                                                                  |
|  +-----------------+  +-----------------+  +------------------+  |
|  | Swarm Engine    |  | AI Diagnostician|  | Knowledge (RAG)  |  |
|  | EngineSentinel  |  | Agent loop      |  | Rule embeddings  |  |
|  | TaskPlanner     |  | 5 tools, max 5  |  | Analysis vectors |  |
|  | 4 Workers:      |  | iterations      |  | Cosine search    |  |
|  |   Analyst       |  | FRETTLSM-aware  |  |                  |  |
|  |   Diagnostician |  |                 |  |                  |  |
|  |   BriefWriter   |  |                 |  |                  |  |
|  |   Knowledge     |  |                 |  |                  |  |
|  +-----------------+  +-----------------+  +------------------+  |
|                                                                  |
|  +------------------+  +-------------------+                     |
|  | Confidence       |  | Provider Registry |                     |
|  | Scoring Module   |  | Gemini > OpenAI > |                     |
|  | 0.0-1.0 range    |  | Cloudflare >      |                     |
|  | canonical std    |  | Template fallback |                     |
|  +------------------+  +-------------------+                     |
+------------------------------------------------------------------+
                                 |
                                 v
+------------------------------------------------------------------+
|                            DATA LAYER                            |
|                                                                  |
|  +-------------------------+  +------------------------------+   |
|  | PostgreSQL 17 + pgvector|  | Object Storage (future)      |   |
|  |                         |  | Waveforms, attachments,      |   |
|  | Asset hierarchy:        |  | inspection images            |   |
|  |   organization          |  +------------------------------+   |
|  |   locations             |                                     |
|  |   areas                 |  +------------------------------+   |
|  |   equipment             |  | Rule Store (YAML)            |   |
|  |   sub_assemblies        |  |   actions.yaml               |   |
|  |   measurement_points    |  |   profiles.yaml              |   |
|  |   spares                |  |   block_scores.yaml          |   |
|  |                         |  |   fusion.yaml                |   |
|  | Intelligence:           |  +------------------------------+   |
|  |   analysis_results      |                                     |
|  |   analysis_vectors      |  +------------------------------+   |
|  |   rule_vectors          |  | Seed Data                    |   |
|  |   sensor_readings       |  | IMS failure modes (119)      |   |
|  |                         |  | Signal feature rules (50)    |   |
|  | Maintenance:            |  | Guard rules (16)             |   |
|  |   work_orders           |  | System profiles              |   |
|  |   pm_schedules          |  +------------------------------+   |
|  |   failure_history       |                                     |
|  |                         |                                     |
|  | Observability:          |                                     |
|  |   alerts                |                                     |
|  |   audit_log             |                                     |
|  +-------------------------+                                     |
+------------------------------------------------------------------+
                                 |
                                 v
+------------------------------------------------------------------+
|                      EXTERNAL INTEGRATIONS                       |
|                                                                  |
|  +-------------------+  +------------------+  +---------------+  |
|  | Sensor Data       |  | CMMS Export      |  | Notification  |  |
|  | Ingestion         |  | Work orders,     |  | Services      |  |
|  | POST /evaluate    |  | spare requests,  |  | SSE stream,   |  |
|  | POST /data/sensor |  | maintenance logs |  | alerts,       |  |
|  |                   |  |                  |  | webhooks      |  |
|  +-------------------+  +------------------+  +---------------+  |
|                                                                  |
|  +-------------------+  +------------------+                     |
|  | AI Providers      |  | Embedding API    |                     |
|  | Gemini, OpenAI,   |  | 768-dim vectors  |                     |
|  | Cloudflare, local |  | for RAG search   |                     |
|  +-------------------+  +------------------+                     |
+------------------------------------------------------------------+

The database is a single PostgreSQL 17 instance with the pgvector extension. The simplicity is deliberate: one database to back up, one connection string to configure, one set of migrations to manage.

Drizzle ORM defines the schema in TypeScript. The tables are organized into four functional groups:

Asset Hierarchy. The plant structure follows a strict parent-child chain: organization (owned by better-auth) > locations > areas > equipment > sub_assemblies, which then branch into measurement_points and spares. Each level carries its own metadata: locations have geographic coordinates, equipment has machine type and criticality scores, measurement points have signal type and direction, spares have stock quantities and lead times.

Analysis and Intelligence. analysis_results stores the complete pipeline output for every evaluation: the original request, the full response (as JSONB), the computed SSI, and the severity level. sensor_readings holds raw sensor data keyed by measurement point and timestamp, with values stored as JSONB to accommodate different sensor types without schema changes.

Maintenance. work_orders track maintenance tasks through their lifecycle (status, priority, assignee, parts used). pm_schedules define preventive maintenance intervals with both time-based and condition-based triggers. failure_history stores FMEA records with severity, occurrence, and detection ratings.

Observability. alerts capture system alerts with type, severity, and acknowledgment status. audit_log records every significant user action with the entity type, action taken, and a JSONB details payload for forensic analysis.

Two vector tables store 768-dimensional embeddings:

rule_vectors — Synchronized on startup when the RAG_RULES feature flag is enabled. Each of the 119 component fault rules is embedded as a document containing: the diagnosis, the underlying physics, the FRETTLSM root cause category, the severity progression from early to late stage, and the recommended corrective actions. Content-hash deduplication ensures zero redundant API calls after the initial sync.

analysis_vectors — Embedded as a background task after every successful POST /evaluate. The document includes: asset identity, health stage, SSI score, top faults detected, top recommended actions, and remaining useful life estimate.
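The content-hash deduplication described for rule_vectors can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's implementation: SHA-256 as the hash function, the `plan_sync` helper, and the rule IDs and document strings are all hypothetical.

```python
import hashlib


def content_hash(document: str) -> str:
    """SHA-256 of the rule document text, used as a change detector (assumed hash)."""
    return hashlib.sha256(document.encode("utf-8")).hexdigest()


def plan_sync(rules: dict[str, str], stored_hashes: dict[str, str]) -> list[str]:
    """Return the rule IDs whose document text is new or has changed.

    rules:         rule_id -> document text built from the rule (hypothetical shape)
    stored_hashes: rule_id -> hash saved next to the vector at the last sync
    """
    return [
        rule_id
        for rule_id, document in rules.items()
        if stored_hashes.get(rule_id) != content_hash(document)
    ]


# Hypothetical rule documents for illustration only.
rules = {
    "R001": "Outer race defect: elevated BPFO harmonics with rising kurtosis",
    "R002": "Unbalance: dominant 1x running speed component",
}

# First startup: nothing is stored yet, so every rule needs an embedding call.
first = plan_sync(rules, {})

# Second startup with unchanged rules: nothing needs to be embedded again.
stored = {rule_id: content_hash(doc) for rule_id, doc in rules.items()}
second = plan_sync(rules, stored)
```

Only the IDs returned by `plan_sync` would need an embedding API call, which is why a restart with unchanged rules costs nothing.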

Search is performed via GET /rapid-ai/v1/knowledge/search?q=...&top_k=5, which runs cosine similarity queries across both tables, merges the results, and returns them sorted by relevance. This creates a knowledge growth loop: every analysis enriches the search corpus, making future diagnoses more informed.
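The merge-and-rank step can be illustrated in memory. In the running system the similarity would be computed inside PostgreSQL by pgvector (its `<=>` operator returns cosine distance); the sketch below only mirrors the logic. The `search` helper, the document IDs, and the toy 3-dimensional vectors are assumptions made for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, rule_rows, analysis_rows, top_k=5):
    """Score rows from both tables, merge them, and return the top_k by similarity."""
    scored = []
    for source, rows in (("rule_vectors", rule_rows), ("analysis_vectors", analysis_rows)):
        for doc_id, vector in rows:
            scored.append((cosine_similarity(query, vector), source, doc_id))
    scored.sort(key=lambda hit: hit[0], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors stand in for the 768-dimensional embeddings.
rule_rows = [("bpfo-outer-race", [1.0, 0.0, 0.0]), ("unbalance", [0.0, 1.0, 0.0])]
analysis_rows = [("pump-7-analysis", [0.6, 0.8, 0.0])]

hits = search([1.0, 0.0, 0.0], rule_rows, analysis_rows, top_k=2)
for score, source, doc_id in hits:
    print(f"{score:.2f}  {source:<16}  {doc_id}")
```

Because both tables are ranked on the same similarity scale, a past analysis can outrank a rule (or the reverse) purely on relevance to the query.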

| Table | PK Type | Key Columns | Purpose |
| --- | --- | --- | --- |
| organization | text | name, slug | Top-level tenant |
| locations | uuid | organisation_id, country, geo_lat/lon | Plant sites |
| areas | uuid | location_id, area_type | Functional areas |
| equipment | uuid | area_id, machine_type, criticality | Machines |
| sub_assemblies | uuid | equipment_id, component_type, position | Components |
| measurement_points | uuid | sub_assembly_id, direction, signal_type | Sensor locations |
| spares | uuid | sub_assembly_id, part_number, quantity_on_hand | Inventory |
| analysis_results | uuid | request/response JSONB, ssi, severity | Pipeline outputs |
| sensor_readings | uuid | measurement_point_id, timestamp, values JSONB | Raw data |
| work_orders | uuid | status, priority, assignee, parts_used | Maintenance tasks |
| failure_history | uuid | failure_mode, cause, severity/occurrence/detection | FMEA records |

High-priority indexes target the most common query patterns:

  • sensor_readings(measurement_point_id, timestamp) — Time-series lookups by sensor, ordered by time. This is the most critical index in the system; every trend query, every baseline comparison, every dashboard chart hits it.
  • analysis_results(equipment_id, created_at) — Retrieving analysis history for a specific machine.
  • equipment(area_id, status) — Listing active machines in a plant area.
  • work_orders(equipment_id, status) — Finding open work orders for a machine.
  • rule_vectors and analysis_vectors use IVFFlat indexes on their vector columns for approximate nearest-neighbor search.

For time-series data at scale, PostgreSQL’s native table partitioning (by month on timestamp) is the planned approach before considering a dedicated time-series database. The principle: exhaust PostgreSQL before adding infrastructure.


All endpoints are prefixed /rapid-ai/v1/ and follow pragmatic REST conventions. The API surface is organized into five functional groups:

Physics Pipeline — The core diagnostic capability.

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /evaluate | Full 5-module pipeline (A > B > C > D > E) |
| POST | /moduleA | Signal analysis only |
| POST | /moduleB | Fault detection only |
| POST | /moduleC | SSI fusion only |
| POST | /moduleD | Prognostics only |
| POST | /moduleE | Maintenance planning only |
| POST | /generate-signal | Synthetic signal generation |
| POST | /diagnose | AI diagnostician (feature-gated) |

Asset Hierarchy — CRUD operations on plant structure.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /assets/organisations | List all organizations |
| GET | /assets/locations/{org_id} | Locations in an organization |
| GET | /assets/areas/{location_id} | Areas in a location |
| GET | /assets/equipment/{area_id} | Equipment in an area |
| GET | /assets/equipment/{id}/context | Equipment with full context |
| GET | /assets/sub-assemblies/{equipment_id} | Sub-assemblies |
| GET | /assets/measurement-points/{sub_id} | Measurement points |
| GET/PATCH | /assets/spares/* | Spare parts and stock management |

Swarm (Agent Coordination) — Multi-agent diagnostic intelligence.

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /swarm/task | Submit a pre-built AgentTask |
| POST | /swarm/dispatch | Intent-based dispatch (plans then executes) |
| GET | /swarm/capabilities | List all worker capabilities |
| GET | /swarm/status | Worker count, active tasks |
| GET | /swarm/stream | SSE stream (heartbeats + events) |

Knowledge (RAG) — Semantic search across rules and past analyses.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /knowledge/search?q=...&top_k=5 | Cosine similarity search |

Health — System status and feature flag reporting.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /health | Status, version, feature flags, providers |

All request bodies are validated by Pydantic models in the engine’s domain layer. A typical pipeline evaluation request:

{
  "equipment_id": "uuid-of-equipment",
  "system_type": "centrifugal_pump",
  "operating_speed_hz": 29.5,
  "signal": {
    "waveform": [0.12, -0.08, 0.15, ...],
    "sampling_rate_hz": 8192,
    "signal_type": "acceleration"
  }
}

The response returns the full pipeline output: Module A features, Module B fault detections, Module C SSI score, Module D health stage and RUL, Module E maintenance actions — all in a single FullAnalysisResponse object. After the response is sent, a background task embeds the result into pgvector for future RAG search.
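For API consumers, a client call can be sketched with the standard library alone. The base URL (localhost on the engine's port 8000), the `build_evaluate_request` helper, and the three-sample waveform are assumptions for illustration; a real waveform would contain far more samples, and the equipment ID is the placeholder from the request shown above.

```python
import json
import urllib.request

# Assumed local development address; the port and prefix come from this chapter.
BASE_URL = "http://localhost:8000/rapid-ai/v1"


def build_evaluate_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /evaluate endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/evaluate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "equipment_id": "uuid-of-equipment",  # placeholder, as in the example above
    "system_type": "centrifugal_pump",
    "operating_speed_hz": 29.5,
    "signal": {
        "waveform": [0.12, -0.08, 0.15],  # truncated illustrative samples
        "sampling_rate_hz": 8192,
        "signal_type": "acceleration",
    },
}
request = build_evaluate_request(payload)

# To send it against a running engine:
#   with urllib.request.urlopen(request, timeout=30) as response:
#       analysis = json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before it reaches the engine.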

The current version prefix /rapid-ai/v1/ is baked into all routes. When breaking changes are introduced, a /v2/ prefix will run alongside /v1/ with a deprecation window. Non-breaking additions (new fields, new endpoints) are added to the current version without incrementing.


The service layer is the intellectual core of RAPID AI. It lives in engine/domain/ (Ring 0, pure physics, no I/O) and engine/services/ (Ring 1, orchestration). Here is what each engine does and how.

Rule Evaluator: Safe Recursive-Descent Parser

RAPID AI contains 119 component fault rules, 50 signal feature rules, and 16 data guard rules. These rules are not executed via eval() or any dynamic code interpretation. They are expressed as structured condition/action pairs — either as Python dataclasses or YAML configuration — and evaluated by a safe recursive-descent evaluator that walks condition trees without ever executing arbitrary code.

The guard rules (DG001-DG019) run first, checking data quality before any diagnostic computation begins. If the signal fails data validation — missing samples, unreasonable amplitudes, sampling rate below Nyquist — the pipeline stops early with an explanation, not a crash.
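The idea of walking a condition tree without `eval()` can be sketched in a few lines. The tree schema used here (`all`, `any`, `not`, `feature`, `op`, `value`) and the example thresholds are hypothetical; the chapter does not specify the actual rule format.

```python
import operator

# Whitelisted comparison operators. Anything else is rejected with a KeyError,
# so a rule can never smuggle in arbitrary code.
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


def evaluate(condition: dict, features: dict) -> bool:
    """Recursively evaluate a condition tree against extracted signal features."""
    if "all" in condition:
        return all(evaluate(child, features) for child in condition["all"])
    if "any" in condition:
        return any(evaluate(child, features) for child in condition["any"])
    if "not" in condition:
        return not evaluate(condition["not"], features)
    compare = COMPARATORS[condition["op"]]
    return compare(features[condition["feature"]], condition["value"])


# Hypothetical rule: high kurtosis AND (high crest factor OR high RMS).
rule = {
    "all": [
        {"feature": "kurtosis", "op": ">", "value": 4.0},
        {
            "any": [
                {"feature": "crest_factor", "op": ">=", "value": 5.0},
                {"feature": "rms", "op": ">", "value": 7.1},
            ]
        },
    ]
}

degraded = evaluate(rule, {"kurtosis": 5.2, "crest_factor": 3.1, "rms": 8.0})
healthy = evaluate(rule, {"kurtosis": 3.0, "crest_factor": 3.1, "rms": 2.0})
```

The evaluator only ever looks up operators from a fixed table and reads values from the feature dictionary, so rule content stays data from end to end.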

Diagnostic Engine: IMS-Driven, Data-Not-Code

The Integrated Master Schema (IMS) is a database of 119 failure mode signatures across 12 component types (antifriction bearings, gears, journal bearings, motors, pumps, fans, compressors, couplings, shafts, seals, structures, and belts). Each entry maps a failure mechanism to its expected spectral signature, FRETTLSM root cause category, severity progression, and recommended corrective actions.

The diagnostic engine does not hardcode diagnostic logic. It reads IMS entries and compares them against extracted signal features. When sensor features match an IMS pattern — for example, elevated BPFO harmonics with increasing kurtosis — the engine scores the match and ranks it by confidence. The intelligence is in the data, not the code. Adding a new failure mode means adding an IMS row, not writing new software.

SEDL Engine: Shannon Entropy Across Three Domains

The Spectral-Entropy-Directional-Lens (SEDL) applies information-theoretic analysis to detect stability degradation before threshold-based alarms fire. It computes entropy across three domains:

  • Spectral Entropy (SE): Measures disorder in the frequency spectrum. A healthy machine concentrates energy in narrow bands; a degrading machine spreads energy across wider bands, increasing spectral entropy.
  • Temporal Entropy (TE): Measures disorder in the time-domain signal. Erratic amplitude variation indicates instability.
  • Directional Entropy (DE): Measures disorder across measurement axes. When vibration energy shifts unpredictably between horizontal, vertical, and axial directions, it signals mechanical instability.

SEDL runs as part of Module B (fault detection) and feeds into Module C (fusion).
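All three entropies rest on the same Shannon calculation applied to a different distribution of energy. The sketch below shows that shared calculation; normalizing by log2(N) so the result falls between 0 and 1 is an assumption, as are the example values, and the chapter does not specify how SEDL bins the spectrum or combines SE, TE, and DE.

```python
import math


def normalized_shannon_entropy(energies: list[float]) -> float:
    """Shannon entropy of an energy distribution, scaled to the range 0..1.

    0.0 means all energy sits in one bin; 1.0 means it is spread evenly.
    """
    total = sum(energies)
    if total <= 0 or len(energies) < 2:
        return 0.0
    entropy = 0.0
    for energy in energies:
        if energy > 0:
            p = energy / total
            entropy -= p * math.log2(p)
    return entropy / math.log2(len(energies))


# Spectral view: energy per frequency band (illustrative values).
narrow_band = normalized_shannon_entropy([0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0])
broadband = normalized_shannon_entropy([1.0] * 8)

# Directional view: energy per axis (horizontal, vertical, axial).
directional = normalized_shannon_entropy([8.0, 1.0, 1.0])
```

A healthy machine with energy concentrated in one band scores near 0, while energy smeared evenly across bands scores near 1, which matches the qualitative description of spectral entropy above.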

Fusion Engine: Profile-Weighted Block Aggregation

Module C (fuse_ssi) computes the System Stability Index through a three-pass block scoring process using rules BSR001 through BSR022. Each block represents a diagnostic dimension — spectral health, trend stability, entropy state, component-specific risk. Blocks are scored individually, then aggregated using weights defined in profiles.yaml that vary by system type. A centrifugal pump has different weight profiles than a gearbox.

The SSI is not a plain average. It is a weighted aggregation that emphasizes the most diagnostically significant dimensions for each machine type.

Downstream of fusion, Module E computes the final priority score from four weighted terms: severity (S), criticality (C), kurtosis factor (K), and urgency (U). Canonical reference: see Chapter 26 for the authoritative priority formula.

P = 100 x (0.45 x Severity + 0.25 x Criticality + 0.20 x Kurtosis_factor + 0.10 x Urgency)
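As a worked example of the formula, the sketch below assumes each input is already normalized to the range 0 to 1, which makes P fall between 0 and 100. That normalization, the function name, and the sample values are assumptions; Chapter 26 remains the authoritative definition.

```python
def priority_score(severity: float, criticality: float,
                   kurtosis_factor: float, urgency: float) -> float:
    """P = 100 x (0.45 S + 0.25 C + 0.20 K + 0.10 U), inputs assumed in 0..1."""
    inputs = {
        "severity": severity,
        "criticality": criticality,
        "kurtosis_factor": kurtosis_factor,
        "urgency": urgency,
    }
    for name, value in inputs.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within 0..1, got {value}")
    return 100 * (
        0.45 * severity
        + 0.25 * criticality
        + 0.20 * kurtosis_factor
        + 0.10 * urgency
    )


# 100 x (0.36 + 0.15 + 0.10 + 0.04) = 65, up to floating-point rounding.
example = priority_score(severity=0.8, criticality=0.6, kurtosis_factor=0.5, urgency=0.4)
print(round(example, 1))
```

Because the four weights sum to 1.0, severity alone can contribute at most 45 points, so a highly severe fault on a low-criticality machine cannot reach the top of the range by itself.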

RUL Engine: Three Weibull/Log-Slope Models

Module D estimates Remaining Useful Life through three parallel approaches:

  1. Weibull-adjusted decay — Maps the current health stage (from HSR001-HSR010 health stage rules) to a Weibull survival curve. The rul_multiplier from the matched health stage rule adjusts the baseline estimate.
  2. Log-slope trend projection — Extrapolates the rate of degradation from trend analysis to estimate when a threshold will be crossed.
  3. Envelope estimation — Provides optimistic and pessimistic bounds based on confidence intervals.

The three estimates are combined with weights that favor the trend-based projection when sufficient historical data exists.
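The log-slope projection (approach 2) can be illustrated with an ordinary least-squares fit of ln(amplitude) against time. This is a sketch of the general technique only: the function name, the units (days), and the sample data are assumptions, and the Weibull adjustment and the weights used to combine the three estimates are not specified in this chapter and are not modeled here.

```python
import math


def log_slope_rul(times: list[float], amplitudes: list[float], threshold: float):
    """Time remaining until an exponential trend fitted to the history reaches threshold.

    Fits ln(amplitude) = intercept + slope * t by least squares. Amplitudes must be
    positive. Returns None when the fitted trend is flat or improving.
    """
    n = len(times)
    if n < 2 or n != len(amplitudes):
        raise ValueError("need at least two (time, amplitude) pairs")
    logs = [math.log(a) for a in amplitudes]
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("timestamps must not all be identical")
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs)) / sxx
    if slope <= 0:
        return None  # no degradation trend to extrapolate
    intercept = mean_y - slope * mean_t
    crossing_time = (math.log(threshold) - intercept) / slope
    return max(0.0, crossing_time - times[-1])


# Amplitude doubling every 10 days: 1 -> 2 -> 4 -> 8. A threshold of 32 is two
# more doublings away, so the projection is about 20 days after the last sample.
rul_days = log_slope_rul([0, 10, 20, 30], [1.0, 2.0, 4.0, 8.0], threshold=32.0)
stable = log_slope_rul([0, 10, 20], [2.0, 2.0, 2.0], threshold=32.0)
```

Working in log space turns exponential growth into a straight line, which is why a handful of trend points is enough to project a threshold crossing.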

CDE Engine: Two-Phase Trigger-Then-Evaluate

The Contradiction-Driven Engineering engine identifies situations where improving one engineering parameter necessarily worsens another — the kind of trade-off that separates root cause treatment from symptom management.

Phase 1 (Trigger): The engine scans the diagnostic output for contradiction triggers — cases where a recommended action would create a new problem. For example, increasing bearing clearance to reduce thermal preload worsens the machine’s tolerance to unbalance forces.

Phase 2 (Evaluate): Once a contradiction is triggered, the engine retrieves the relevant contradiction template, evaluates the trade-off severity, and generates resolution alternatives ranked by engineering impact.

The causal engine classifies root causes using the FRETTLSM taxonomy developed by Dibyendu De:

| Letter | Category | Examples |
| --- | --- | --- |
| F | Force | Preload, misalignment, unbalance |
| R | Reactive | Resonance, structural looseness |
| E | Environment | Contamination, corrosion |
| T | Temperature | Thermal expansion, overheating |
| T | Tribology | Surface fatigue, pitting |
| L | Lubrication | Starvation, wrong viscosity |
| S | Surface | Spalling, brinelling, erosion |
| M | Man | Wrong clearance, installation error |

Each of the 121 rules is classified by FRETTLSM category. The causal engine matches diagnostic findings to their FRETTLSM classification and builds cause chains: trigger (the initiator) > accelerator (what made it worse) > retarder (what could have slowed it).

Confidence Module: Canonical Scoring Standard

All confidence scores across the system follow a single canonical standard:

  • Format: Numeric float, range 0.0 to 1.0
  • Field name: confidence_score everywhere

Qualitative labels are mapped to fixed numeric values:

| Label | Value | Application |
| --- | --- | --- |
| High | 0.85 | Direct sensor confirmation |
| Medium-high | 0.75 | Strong single-source evidence |
| Medium | 0.60 | Ambiguous/single source |
| Low | 0.40 | Weak signal/noisy data |
| Insufficient | 0.00 | Contradictory evidence |

Decision thresholds: RCM activation requires >= 0.70. Safety escalation requires >= 0.80. Dashboard display suppresses anything below 0.50 to prevent noise from reaching operators.
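The label mapping and the three thresholds translate directly into a small lookup and gate. The values below come from this section; the constant names, the `decisions` helper, and the returned keys are hypothetical.

```python
# Qualitative labels mapped to the canonical numeric values from the table above.
CONFIDENCE_BY_LABEL = {
    "high": 0.85,
    "medium-high": 0.75,
    "medium": 0.60,
    "low": 0.40,
    "insufficient": 0.00,
}

DASHBOARD_DISPLAY_MIN = 0.50   # anything below this is suppressed
RCM_ACTIVATION_MIN = 0.70
SAFETY_ESCALATION_MIN = 0.80


def decisions(confidence_score: float) -> dict[str, bool]:
    """Apply the three decision thresholds to one confidence_score."""
    if not 0.0 <= confidence_score <= 1.0:
        raise ValueError("confidence_score must be within 0.0..1.0")
    return {
        "show_on_dashboard": confidence_score >= DASHBOARD_DISPLAY_MIN,
        "activate_rcm": confidence_score >= RCM_ACTIVATION_MIN,
        "escalate_safety": confidence_score >= SAFETY_ESCALATION_MIN,
    }


medium_high = decisions(CONFIDENCE_BY_LABEL["medium-high"])
low = decisions(CONFIDENCE_BY_LABEL["low"])
```

With these values, a "Medium-high" finding (0.75) is displayed and activates RCM but does not trigger safety escalation, while a "Low" finding (0.40) never reaches the dashboard.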


A complete diagnostic request traces the following path through the system:

 1. Sensor data arrives via POST /rapid-ai/v1/evaluate
    Request includes: equipment_id, system_type, operating_speed_hz,
    signal waveform, sampling_rate_hz, signal_type.
 2. _resolve_asset_context() enriches the request with machine_type
    and criticality from the database.
 3. Module A: analyze_signal()
    - Guard rules (DG001-DG019) validate data quality.
    - Signal features extracted: RMS, peak, crest factor, kurtosis.
    - ISO zone classification applied.
    - If data fails guard rules, pipeline stops with explanation.
 4. Module B: detect_faults() [sequential evaluation]
    - 119 component fault rules matched against features.
    - 50 signal feature rules (SF001-SF051) evaluated.
    - Trend analysis classifies pattern: Step/Chaotic/Accel/Drift/Stable.
    - SEDL entropy computed: SE + TE + DE.
    - Matched faults ranked by confidence score.
 5. Module C: fuse_ssi()
    - BSR001-BSR022 block scoring in 3 passes.
    - System profile weights loaded from profiles.yaml.
    - SSI (System Stability Index) computed as weighted aggregation.
 6. Module D: predict_prognostics()
    - HSR001-HSR010 health stage rules determine current stage.
    - RUL estimated via Weibull-adjusted log-slope model.
    - rul_multiplier from matched health stage adjusts baseline.
 7. Module E: plan_maintenance()
    - PWR001-PWR010 priority windows evaluated in 2 passes.
    - Priority score computed: P = 100 x (0.45S + 0.25C + 0.20K + 0.10U).
    - Actions ranked with boost factors for urgent conditions.
 8. FullAnalysisResponse returned to client.
 9. Background task: embed_analysis() generates a 768-dim vector
    and stores it in analysis_vectors table (pgvector).
10. If SSI exceeds alert thresholds, an alert record is created.
    SSE stream pushes event to connected Explorer clients.
11. Explorer dashboard fetches /assets/equipment/{id}/context for updated
    analysis history and displays trend charts.

The type contract chain ensures consistency from database to browser:

Python Pydantic (Ring 0) --> REST API (JSON) --> @rapidai/contracts (TypeScript) --> App imports
        |                                                 |
settings.py (thresholds)                         hierarchy.ts, analysis.ts
hierarchy.py (schemas)                           base.ts, rules.ts
swarm.py (protocol)                              diagnostics-types.ts

Pydantic models are the source of truth. TypeScript contracts in @rapidai/contracts mirror the Pydantic shapes. The Explorer app imports only from the contracts package, never from the engine directly. This ensures that if a schema changes in Python, the TypeScript build breaks immediately — before a runtime error can reach a user.


RAPID AI is designed to run on a single machine today and scale horizontally when demand requires it.

Both the FastAPI engine and SvelteKit Explorer are stateless. No request depends on server-local state. Session data lives in PostgreSQL (via better-auth), analysis results live in PostgreSQL, and configuration lives in YAML files baked into the container image. This means multiple engine instances can run behind a load balancer with no session affinity required.

  • Read replicas: Dashboard queries (trend charts, asset lists, analysis history) are read-heavy and can be directed to a PostgreSQL read replica.
  • Connection pooling: SQLAlchemy async with asyncpg uses pool_size=10 and max_overflow=20. For higher concurrency, PgBouncer can be added as a connection multiplexer.
  • Partitioning: sensor_readings and analysis_results are candidates for time-based partitioning when row counts reach the tens of millions.
  • Graceful degradation: If DATABASE_URL is not set, the engine runs in pure-compute mode — the pipeline works, persistence does not. This allows the physics engine to function independently during database maintenance.
  • Rule store: The 119 component rules, 50 signal feature rules, and YAML configuration files (profiles, block scores, fusion weights, actions) are loaded into memory on startup. They change infrequently and are small enough to hold entirely in RAM.
  • IMS cache: Failure mode signatures are loaded once and cached for the lifetime of the process. A restart picks up any changes.
  • Vector embeddings: Rule vectors are synced on startup with content-hash deduplication. After the initial sync, subsequent startups require zero embedding API calls.
  • Trend analysis (Module B.2) and SEDL entropy (Module B.3) are evaluated sequentially within Module B. A previous ThreadPoolExecutor(max_workers=4) implementation was removed because the GIL prevents true parallelism for CPU-bound Python work.
  • Vector embedding after analysis is already a background task — the response returns to the client before the embedding is complete.
  • Swarm tasks: The swarm engine dispatches agent tasks asynchronously. Long-running diagnostic reasoning (up to 5 LLM iterations) runs without blocking the HTTP response thread.
  • SSE streaming: Server-sent events provide real-time updates to the dashboard without polling. The Explorer’s /api/stream route proxies the engine’s SSE stream and merges it with local event channels (alerts, sensors, health, swarm).
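The graceful-degradation behavior listed above (the pipeline still runs when DATABASE_URL is absent) can be sketched as a guard around the persistence step. The function names and the callback-style wiring are hypothetical, and treating an empty DATABASE_URL the same as an unset one is an assumption.

```python
import os


def persistence_enabled(env=None) -> bool:
    """True when a non-empty DATABASE_URL is configured."""
    env = os.environ if env is None else env
    return bool(env.get("DATABASE_URL", "").strip())


def evaluate_and_store(request, run_pipeline, save_result, env=None):
    """Always run the physics pipeline; persist the result only when a database exists.

    run_pipeline and save_result are injected callables (hypothetical names) so the
    compute path never imports or touches database code.
    """
    result = run_pipeline(request)
    persisted = False
    if persistence_enabled(env):
        save_result(result)
        persisted = True
    return result, persisted


# Pure-compute mode: no DATABASE_URL, so nothing is saved but a result is returned.
saved = []
result, persisted = evaluate_and_store(
    {"system_type": "centrifugal_pump"},
    run_pipeline=lambda req: {"ssi": 0.42},
    save_result=saved.append,
    env={},
)
```

Keeping the persistence decision outside the pipeline function is what lets the compute path stay free of I/O, in line with the ring separation described earlier.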

better-auth manages user authentication with admin and organization plugins. It creates and manages its own tables (user, session, account, verification, member, invitation). The Explorer’s auth.ts serves as the BFF bridge — the SvelteKit server validates sessions before proxying requests to the FastAPI engine.

Session tokens are signed with BETTER_AUTH_SECRET. In production, this must be a cryptographically random string, not the development default.

Role-based access control is enforced at the SvelteKit BFF layer:

| Role | Permissions |
| --- | --- |
| Viewer | Read dashboards, view analysis history |
| Engineer | Run analyses, create work orders, manage spares |
| Admin | Manage users, organizations, system configuration |
| Approver | Approve engineering rule changes (governance) |

The FastAPI engine currently trusts requests from the SvelteKit server (internal network boundary). API key authentication for external consumers is planned for the multi-tenant phase.

  • TLS in transit: All external communication over HTTPS. Internal service-to-service communication (SvelteKit to FastAPI) uses HTTP within the Docker network; TLS termination at the load balancer for production.
  • Encryption at rest: Managed PostgreSQL providers (Supabase, Neon) encrypt storage by default. Self-hosted deployments should enable PostgreSQL’s native encryption or use encrypted volumes.
  • Secret management: API keys (GEMINI_API_KEY, OPENAI_API_KEY, CLOUDFLARE_API_TOKEN, BETTER_AUTH_SECRET) are injected via environment variables, never committed to source control.
  • Pydantic models: Every API input is validated by a Pydantic model before reaching the service layer. Invalid requests are rejected with structured error responses.
  • Parameterized SQL: All database queries use SQLAlchemy’s parameterized query builder. No string interpolation in SQL.
  • No eval(): Rule evaluation uses a safe recursive-descent parser. No user-supplied input is ever executed as code.
  • IP protection: Engineering rules are executed server-side only. The API returns diagnostic results and explanations, never the raw rules or their internal logic. Explainability summaries replace full logic disclosure.

Three services defined in infra/docker-compose.yml:

services:
  db:      pgvector/pgvector:pg17        :5432
  engine:  Dockerfile.api (Python 3.13)  :8000  (depends: db)
  app:     Dockerfile.app (SvelteKit)    :5173  (depends: db + engine)

The Dockerfile.api uses a three-stage build on python:3.13-slim: a base stage, a stage that installs dependencies via uv, and a final stage that copies the application code and runs uvicorn as a non-root user. The Dockerfile.app builds on oven/bun:alpine.

Quick start for local development:

make dev-db # Start PostgreSQL in Docker
make dev-api # Start FastAPI engine (uvicorn at :8000)
make dev-web # Start SvelteKit app (Bun at :5173)
# Or all at once:
make dev # API + Explorer (DB must be running)

Database management:

make db-push # Push Drizzle schema to DB
make db-generate # Generate migration files
make db-migrate # Run pending migrations
make db-studio # Open Drizzle Studio for visual inspection

Option A: Simple container hosting — Docker Compose on a single VPS with nginx as reverse proxy and TLS terminator. Suitable for early customers and proof-of-concept deployments.

Option B: Container orchestration — Kubernetes or Docker Swarm for horizontal scaling. The stateless services scale horizontally; the database runs as a managed service (Supabase, Neon, or AWS RDS with pgvector).

Option C: Edge-compatible — For plants with restricted connectivity, the FastAPI engine can run locally with periodic sync to a central database. The system degrades gracefully: without DATABASE_URL, the physics pipeline still functions in pure-compute mode.

The SvelteKit Explorer can be deployed in three configurations:

  1. Self-hosted: Node.js/Bun process behind a reverse proxy. Required when the BFF layer needs direct PostgreSQL access.
  2. Vercel/Cloudflare Pages: SvelteKit’s adapter system supports edge deployment. The BFF functions run as serverless functions.
  3. Static export: For dashboard-only deployments that consume the API without server-side rendering.

| Service | Method | Interval |
| --- | --- | --- |
| PostgreSQL | pg_isready | 5s |
| Engine | GET /health (urllib, no external deps) | 10s |
| Explorer | curl http://localhost:5173/ | 15s |

The /health endpoint returns the engine version, all feature flag states, and available AI provider capabilities — giving operators a single endpoint to verify that the full stack is operational.
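A urllib-only probe of the kind the engine health check refers to might look like the sketch below. The default URL (localhost, port 8000, the /rapid-ai/v1/ prefix), the function name, and the choice to bypass proxy settings are assumptions; the actual health-check command in the compose file may differ.

```python
import urllib.request

# Ignore proxy environment variables: the probe targets the local container.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def engine_is_healthy(url: str = "http://localhost:8000/rapid-ai/v1/health",
                      timeout: float = 3.0) -> bool:
    """Return True when GET <url> answers with HTTP 200 within the timeout."""
    try:
        with _OPENER.open(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError, refused connections, and timeouts;
        # ValueError covers a malformed URL.
        return False


# In a container health check, the result would be turned into an exit code, e.g.:
#   raise SystemExit(0 if engine_is_healthy() else 1)
```

Using only the standard library keeps the probe runnable inside the slim engine image without installing curl or any extra package.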

Engine (Python):

| Variable | Required | Purpose |
| --- | --- | --- |
| DATABASE_URL | For persistence | PostgreSQL connection string |
| GEMINI_API_KEY | For AI features | Google Gemini API key |
| OPENAI_API_KEY | Fallback | OpenAI API key |
| RAPID_AI_PROVIDER_CHAIN | No (default: gemini,openai,cloudflare,template) | AI provider priority |
| RAPID_FEATURE_* | No (all default ON) | Feature flags (set 0 to disable) |
| LOG_LEVEL | No (default: info) | Structured logging level |
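How the engine might read two of these variables can be sketched as follows. The parsing rules are assumptions: the provider chain is treated as a comma-separated list, and a flag is considered ON unless its value is exactly 0. The full variable name RAPID_FEATURE_RAG_RULES is inferred from the RAPID_FEATURE_* pattern and the RAG_RULES flag name, not confirmed by this chapter.

```python
import os

DEFAULT_PROVIDER_CHAIN = "gemini,openai,cloudflare,template"
FEATURE_PREFIX = "RAPID_FEATURE_"


def provider_chain(env=None) -> list[str]:
    """Ordered provider names from RAPID_AI_PROVIDER_CHAIN, or the documented default."""
    env = os.environ if env is None else env
    raw = env.get("RAPID_AI_PROVIDER_CHAIN", DEFAULT_PROVIDER_CHAIN)
    return [name.strip().lower() for name in raw.split(",") if name.strip()]


def feature_enabled(flag: str, env=None) -> bool:
    """Flags default to ON; only an explicit "0" disables one."""
    env = os.environ if env is None else env
    return env.get(FEATURE_PREFIX + flag, "1").strip() != "0"


# Example with an explicit environment mapping instead of the real process environment.
example_env = {
    "RAPID_AI_PROVIDER_CHAIN": "openai, template",
    "RAPID_FEATURE_RAG_RULES": "0",
}
chain = provider_chain(example_env)
rag_on = feature_enabled("RAG_RULES", example_env)
swarm_on = feature_enabled("SWARM_ENGINE", example_env)
```

Passing the environment as a mapping keeps the parsing testable without mutating `os.environ`.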

Explorer (SvelteKit):

| Variable | Required | Purpose |
| --- | --- | --- |
| DATABASE_URL | Yes | PostgreSQL for Drizzle + better-auth |
| BETTER_AUTH_SECRET | Yes | Session signing secret |
| ORIGIN | Yes | SvelteKit origin URL |
| RAPID_AI_ENGINE_URL | Yes | Python API URL for BFF proxy |

Several deliberate trade-offs shape this architecture:

Monolith over microservices. The diagnostic domain is deeply interconnected — Module B’s faults feed Module C’s fusion, which feeds Module D’s prognostics. Splitting these across network boundaries would add latency, complexity, and failure modes without adding value at current scale. The ring architecture ensures that extraction to services is possible later without rewriting domain logic.

PostgreSQL over specialized stores. A single PostgreSQL instance with pgvector handles relational data, vector search, and (via JSONB) semi-structured data. This eliminates the operational burden of managing a separate vector database, time-series database, or document store. When any single concern outgrows PostgreSQL, it can be extracted independently.

Drizzle over SQLAlchemy for schema ownership. The frontend team (SvelteKit) owns the schema because they are closest to the user-facing data model. Python mirrors the schema for queries but never modifies structure. This prevents migration conflicts between two ORMs.

YAML over database for rule configuration. The 119 fault rules, system profiles, and scoring criteria live in YAML files within the codebase. This makes them version-controlled, diff-able, and deployable alongside the code that interprets them. When rule governance requires runtime editing, a database-backed rule store can layer on top.

Feature flags over feature branches. Six feature flags (AI_BRIEF, AI_DIAGNOSTICIAN, RAG_RULES, V2_PIPELINE, SWARM_ENGINE, SWARM_EXPLORER) allow capabilities to be toggled at runtime via environment variables. This enables gradual rollout, A/B testing, and graceful degradation when external services (AI providers, database) are unavailable.

These decisions optimize for the current stage of the product: a small team building a deep product, where developer velocity and debuggability matter more than theoretical scalability. The architecture is designed so that every decision can be revisited without a rewrite.


| Standard | Relevance to This Chapter |
| --- | --- |
| ISO 13374: Condition monitoring and diagnostics of machines | The three-service architecture (FastAPI engine, SvelteKit Explorer, PostgreSQL) implements ISO 13374’s processing chain as a production system, with strict separation between data acquisition, processing, and presentation layers. |
| MIMOSA OSA-CBM: Open System Architecture for CBM | The API-first design with REST endpoints under /rapid-ai/v1/ follows OSA-CBM’s open architecture principles, enabling interoperability with existing plant historians, CMMS, and SCADA systems. |
| IEC 62443: Industrial cybersecurity | The concentric ring architecture (Ring 0 pure physics, Ring 1 orchestration, Ring 2 infrastructure) implements IEC 62443’s defense-in-depth model with strict dependency direction and network boundary separation. |
| OWASP Top 10: Web application security | The schema ownership rule (Drizzle owns DDL) and the BFF pattern prevent common web application vulnerabilities by isolating the diagnostic engine from direct public internet exposure. |

| Version | Date | Author | Changes |
| --- | --- | --- | --- |
| 2.1.0 | 2026-03-17 | Rick D | Added standards alignment, living doc metadata, changelog |
| 2.0.0 | 2026-03-17 | Rick D | Enriched with production codebase content |
| 1.0.0 | 2026-03-17 | Rick D | Initial chapter creation |