Skip to content

Data Architecture

RAPID AI’s data architecture serves a system that must store raw sensor signals, evaluate hundreds of engineering rules against them, track stability over time, and produce explainable diagnostic outputs. This chapter documents how the data is organized, seeded, queried, and governed.


RAPID AI operates on three interconnected schema stacks. Each was designed for a different analytical concern, but together they form the complete data foundation for the platform.

The normalized schema is the operational backbone. It contains the Integrated Master Schema (IMS) — the domain knowledge that RAPID AI uses to evaluate equipment health. Eight tables, each seeded with exactly 100 rows, provide the core reference data.

TablePurposeKey Columns
asset_masterEquipment templates — one row per asset configurationasset_template_id, asset_type, subsystem, component, criticality
functional_failuresWhat can go wrong at a functional levelfunctional_failure_id, asset_type, function_description, failure_description
failure_modesSpecific failure mechanismsfailure_mode_id, asset_type, failure_mode, failure_effect
sensor_evidence_rulesWhat sensor patterns indicate which failuressensor_rule_id, asset_type, sensor_type, evidence_description, confidence_rule
rcm_rulesReliability-centered maintenance decision logicrcm_rule_id, asset_type, failure_mode, recommended_strategy
maintenance_tasksSpecific maintenance actions and proceduresmaintenance_task_id, asset_type, task_description, skill_required
dashboard_output_mappingsHow diagnostic results map to dashboard KPIsdashboard_output_id, asset_type, kpi_widget, chart_type
schema_relation_mapForeign-key bridge connecting all seven tables aboveims_id, references to all other table IDs

The schema_relation_map table is the linchpin. Each of its 100 rows (IMS001 through IMS100) connects one asset template to its corresponding functional failure, failure mode, sensor rule, RCM rule, maintenance task, and dashboard mapping. This is how the system answers the chain: “for this equipment type, what can fail, how would we detect it, and what should we do?”

A ninth table, table_dictionary, stores metadata about the eight tables themselves — their names, column counts, and descriptions. It contains 8 rows, one per table.

The reliability schema, defined in Module D (Prognostics), extends the data model into statistical and temporal analysis territory. It is organized into four layers:

Master / Reference (7 tables):

  • area_master — logical plant sections (areas, units)
  • asset_master — asset registry for reliability analysis
  • asset_tag — normalized tag lists per asset
  • dependency_graph — graph definitions for connected-equipment analysis
  • dependency_node — one node per asset in each graph
  • dependency_edge — directed, weighted dependency edges between nodes
  • metric_catalog — catalog of supported reliability metrics (27 metrics across 7 groups: reliability, maintainability, availability, Weibull, growth, dependency risk, data quality)

Control (2 tables):

  • analysis_request — submitted reliability analysis runs
  • analysis_request_target — targets included in each request (asset, area, or portfolio scope)

Event / History (5 tables):

  • failure_event — failures, degradations, trips, shutdowns, restorations
  • runtime_interval — operating and downtime exposure intervals (basis for MTBF)
  • maintenance_event — work history, optionally linked to triggering failure events
  • sensor_evidence_summary — condition evidence summaries over time windows
  • sensor_inferred_failure_mode — inferred failure modes from sensor evidence

Result (4 tables):

  • reliability_result_asset — asset-level result envelopes
  • reliability_result_area — area-level result envelopes
  • metric_result — normalized metric values (one row per result-metric pair)
  • quality_flag_result — data sufficiency and confidence flags

The schema follows 3NF normalization principles. Result metrics are stored in a normalized metric_result table rather than as fixed columns, making the model extensible for future metrics — additional Weibull outputs, maintenance burden variants, or customer-specific KPIs can be added without schema migration.

Relationship Between Normalized and Reliability Schemas

Section titled “Relationship Between Normalized and Reliability Schemas”

The normalized schema answers what can fail and what to do about it — it is the knowledge layer. The reliability schema answers when and how likely — it is the analytical layer.

They connect through two shared concepts:

  1. Asset identity: Both schemas have asset_master tables, though with different column sets. The normalized schema’s asset_master stores equipment templates (19 asset types), while the reliability schema’s asset_master stores specific installed assets with area references and parent hierarchies.

  2. Failure modes: The normalized schema’s failure_modes table maps failure mechanisms to equipment types and sensor evidence. The reliability schema’s failure_event table records actual failure occurrences with timestamps. The sensor_inferred_failure_mode table bridges the two — it references sensor evidence summaries and maps them to failure mode candidates.

Currently, these stacks use incompatible primary key strategies (string IDs like FM0001 in the normalized schema, varchar(64) IDs like AST-PUMP-001 in the reliability schema). A bridge table or unified PK convention is needed before they can be joined in production queries.

The full production schema (documented in the Database Schema Pack) defines 34 tables organized into seven entity groups:

  1. Tenant and usertenants, users (multi-tenant boundary)
  2. Asset and dependencyplants, asset_types, assets, asset_dependencies, sensors
  3. Telemetry and observationtelemetry_events, waveform_files, manual_observations, maintenance_events, feature_snapshots, anomaly_events
  4. Diagnostic and reasoningdiagnostic_runs, diagnostic_hypotheses, evidence_items
  5. Rule and knowledgeengineering_rules, rule_execution_logs, failure_modes, imperfections, imperfection_findings, contradictions, contradiction_findings
  6. Reliability and planningreliability_assessments, root_cause_chains, design_recommendations, maintenance_plans, maintenance_plan_tasks, spare_recommendations, manpower_recommendations
  7. Governancecopilot_sessions, copilot_messages, rule_approvals, knowledge_audit_logs

The diagnostic run is the central transaction. A single diagnostic_runs row spawns hypotheses, evidence items, imperfection findings, contradiction findings, root cause chains, reliability assessments, and maintenance plans. This star structure makes it possible to trace any recommendation back through the full reasoning chain to the original sensor data.


The Integrated Master Schema (IMS) is the flattened, denormalized view of the eight normalized tables. Each of its 100 rows represents a complete diagnostic pathway — from an equipment template through its failure mode, sensor evidence, RCM decision, and maintenance action, all the way to its dashboard output mapping.

The 34 columns break into logical groups:

GroupColumnsPurpose
Asset identityasset_template_id, asset_type, subsystem, component, criticalityWhat equipment are we talking about?
Functional contextfunction_description, failure_descriptionWhat function does it serve, and what does failure look like?
Failure mechanismfailure_mode, failure_effect, severity_codeHow specifically does it fail?
Sensor evidencesensor_type, evidence_description, confidence_rule, detection_methodWhat sensor readings indicate this failure?
RCM decisionrecommended_strategy, task_interval, decision_basisWhat maintenance strategy does RCM prescribe?
Maintenance actiontask_description, skill_required, estimated_durationWhat specific work needs to be done?
Dashboard outputkpi_widget, chart_type, alert_threshold, display_priorityHow does this appear on screen?
Cross-referencesims_id, all foreign key IDsHow do the normalized tables link?

Loading the IMS: The ReferenceLoader Pattern

Section titled “Loading the IMS: The ReferenceLoader Pattern”

The IMS is loaded once at application startup and cached for the lifetime of the process. The ReferenceLoader pattern works as follows:

  1. On first access: Read all normalized CSV files (or query the database), join them via schema_relation_map, and build an in-memory lookup structure keyed by asset_type and failure_mode_id.
  2. Lazy caching: The loaded data is stored as an immutable dictionary. Subsequent lookups hit the cache, not the database.
  3. Reload on demand: The /admin/schema/reload endpoint invalidates the cache and forces a fresh load — useful after adding new IMS rows.

This pattern keeps diagnostic rule evaluation fast (sub-millisecond lookups) while avoiding the overhead of per-request database queries for reference data that changes infrequently.

The IMS is designed to grow by adding rows, not code. To add support for a new failure mode:

  1. Add one row to asset_master (if the equipment type is new)
  2. Add one row to functional_failures (what function fails)
  3. Add one row to failure_modes (the specific mechanism)
  4. Add one row to sensor_evidence_rules (how to detect it)
  5. Add one row to rcm_rules (what maintenance strategy to apply)
  6. Add one row to maintenance_tasks (what work to perform)
  7. Add one row to dashboard_output_mappings (how to display it)
  8. Add one row to schema_relation_map (link them all together)

No Python code changes. No deployment. The next /admin/schema/reload call picks up the new row, and the diagnostic engine can immediately evaluate it. This is the meaning of “rules are data, not code.”


320 Failure Modes Across 8 Equipment Types

Section titled “320 Failure Modes Across 8 Equipment Types”

The seed data provides a working diagnostic knowledge base from day one. The 320 failure modes are distributed across 8 equipment types:

Equipment TypeModesFM ID Range
Bearing60FM0001-FM0020, FM0141-FM0160, FM0281-FM0300
Pump60FM0021-FM0040, FM0161-FM0180, FM0301-FM0320
Electric Motor40FM0041-FM0060, FM0181-FM0200
Gearbox40FM0061-FM0080, FM0201-FM0220
Fan30FM0081-FM0095, FM0221-FM0235
Compressor30FM0096-FM0110, FM0236-FM0250
Conveyor30FM0111-FM0125, FM0251-FM0265
Hydraulic System30FM0126-FM0140, FM0266-FM0280

Bearings and pumps receive the heaviest coverage (60 modes each) because they are the most common rotating equipment in industrial plants and the most frequent targets for vibration-based diagnostics.

SQL Insertion Order (Referential Integrity)

Section titled “SQL Insertion Order (Referential Integrity)”

The seed data must be loaded in a specific order to satisfy foreign key constraints. The master runner script (00_run_all_seed_inserts.sql) enforces this:

BEGIN;
1. asset_master -- equipment templates (no dependencies)
2. functional_failures -- references asset types
3. failure_modes -- references asset types
4. sensor_evidence_rules -- references asset types and failure modes
5. rcm_rules -- references failure modes and strategies
6. maintenance_tasks -- references asset types and strategies
7. dashboard_output_mappings -- references asset types and KPIs
8. schema_relation_map -- references all seven tables above
9. table_dictionary -- metadata about the tables themselves
COMMIT;

The entire seed load is wrapped in a single transaction. If any INSERT fails (column-shift defect, duplicate key, FK violation), the entire batch rolls back cleanly. This prevents partial loads that would leave the IMS in an inconsistent state.

The seed_insert_manifest.csv tracks what each SQL file contains:

TableSource CSVOutput SQLRow Count
asset_masterasset_master.csvasset_master_seed_inserts.sql100
functional_failuresfunctional_failures.csvfunctional_failures_seed_inserts.sql100
failure_modesfailure_modes.csvfailure_modes_seed_inserts.sql100
sensor_evidence_rulessensor_evidence_rules.csvsensor_evidence_rules_seed_inserts.sql100
rcm_rulesrcm_rules.csvrcm_rules_seed_inserts.sql100
maintenance_tasksmaintenance_tasks.csvmaintenance_tasks_seed_inserts.sql100
dashboard_output_mappingsdashboard_output_mappings.csvdashboard_output_mappings_seed_inserts.sql100
schema_relation_mapschema_relation_map.csvschema_relation_map_seed_inserts.sql100
table_dictionarytable_dictionary.csvtable_dictionary_seed_inserts.sql8

Note the discrepancy: the normalized schema seeds 100 failure modes, while the full master library provides 320. The additional 220 modes (FM0101-FM0320) exist in failure_modes_full_320_seed.sql but do not yet have corresponding sensor evidence rules, RCM rules, maintenance tasks, or dashboard mappings. Extending the normalized schema to cover all 320 modes is a documented gap.


Engineers do not query diagnostic systems with SQL. They say things like “my pump is vibrating and running hot” or “bearing noise getting worse after last overhaul.” These natural language descriptions need to be matched against the IMS’s 100 sensor evidence descriptions and 320 failure mode names.

Traditional keyword search fails here because the vocabulary is inconsistent — an engineer might say “shaking” when the IMS says “elevated RMS velocity,” or “overheating” when the evidence rule says “bearing temperature rise above baseline.”

Vector similarity search solves this by converting both the query and the IMS descriptions into high-dimensional embeddings, then finding the closest matches by cosine distance. The query does not need to use the exact words in the evidence description — it needs to be semantically close.

The embedding pipeline works on rule and evidence descriptions:

  1. Source text: Each sensor_evidence_rules.evidence_description and failure_modes.failure_mode value is concatenated with its asset_type to form a composite text.
  2. Embedding model: Text is converted to vectors using an embedding model (e.g., OpenAI text-embedding-3-small at 1536 dimensions, or a local model for air-gapped deployments).
  3. Storage: Vectors are stored in a pgvector column alongside the source row, enabling similarity queries without a separate vector database.
  4. Indexing: An IVFFlat or HNSW index on the vector column enables sub-second similarity search across the full IMS.

Embeddings are generated once when seed data is loaded and regenerated on /admin/schema/reload. Since the IMS changes infrequently (adding rows, not updating existing ones), embedding freshness is not a concern.

A typical semantic search flow:

User input: "my pump is vibrating and running hot"
|
v
Embed the query text -> [0.012, -0.034, 0.089, ...]
|
v
SELECT failure_mode_id, failure_mode, asset_type,
1 - (embedding <=> query_embedding) AS similarity
FROM failure_modes
WHERE asset_type = 'Pump'
ORDER BY embedding <=> query_embedding
LIMIT 5;
|
v
Results:
1. FM0023 "Bearing overheating" (similarity: 0.87)
2. FM0025 "Seal failure with temperature rise" (similarity: 0.82)
3. FM0021 "Cavitation" (similarity: 0.71)

The results feed into the diagnostic engine as candidate failure modes, bypassing the need for the engineer to know the exact FM ID or navigate the IMS manually. This is the foundation for the Engineering Copilot — a natural language interface to RAPID AI’s rule-based reasoning.


The telemetry_events table captures all sensor data flowing into RAPID AI:

ColumnTypePurpose
telemetry_idUUIDPrimary key
tenant_idUUIDMulti-tenant isolation
asset_idUUIDWhich equipment
sensor_idUUIDWhich sensor (nullable for manual readings)
event_timetimestamptzWhen the reading was taken (indexed)
metric_codetextWhat was measured (rms, temp, rpm, pressure)
metric_valuenumericThe measurement value
unittextEngineering unit (mm/s, degrees C, RPM)
quality_codetextData quality flag (valid, suspect, missing)
ingestion_sourcetextHow it arrived (stream, file, manual)

This design uses a narrow-table pattern: each reading is one row with a metric_code discriminator, rather than one wide row with a column per metric. This accommodates assets with different sensor configurations without schema changes — a pump with 3 sensors and a gearbox with 7 sensors both write to the same table.

At production scale, telemetry_events will be the largest table by orders of magnitude. Two partitioning strategies apply:

By time (primary): Monthly range partitions on event_time. This aligns with the most common query pattern — “show me the last 30 days of readings for this asset” — and enables efficient partition pruning. Old partitions can be detached and moved to cold storage without affecting current queries.

By asset (secondary): Within each time partition, a hash partition on asset_id distributes data evenly across storage. This prevents hot spots when a high-frequency sensor floods the system with readings.

TierRetentionGranularityStorage
Hot90 daysRaw readings (every sample)Primary PostgreSQL
Warm1 yearHourly aggregates (min, max, avg, stddev)Partitioned PostgreSQL
Cold5+ yearsDaily aggregatesObject storage (Parquet files)

Raw waveform data (time-series arrays from accelerometers) is never stored in the relational database. It goes to object storage via the waveform_files table, which stores only the metadata (storage URI, sampling rate, duration) alongside a reference to the asset and sensor.


Module A (Signal Validation) is the first module in the pipeline, and its primary purpose is to prevent bad data from entering the system. It applies 19 GUARD rules (DG001-DG019) that check for:

  • NaN values in sensor readings
  • Flatline signals (sensor stuck at one value)
  • Signal clipping (amplitude exceeding sensor range)
  • Sampling rate violations
  • Unit mismatches
  • Excessive noise floors

Each GUARD rule produces either a hard block (signal rejected, downstream modules receive nothing) or a soft penalty (signal accepted with a reduced quality score). The quality score is a multiplicative product of all triggered penalties:

quality_score = product(penalty_i for each triggered rule)

A signal with no issues scores 1.0. A signal with two soft penalties of 0.8 and 0.9 scores 0.72. This score propagates downstream through every subsequent module’s confidence calculation.

How Bad Data Degrades Confidence Downstream

Section titled “How Bad Data Degrades Confidence Downstream”

Confidence propagation is the mechanism by which Module A’s quality assessment affects every downstream result. The formula:

Canonical reference: See Chapter 6 for the authoritative confidence propagation formula.

C_final = Q_data x (1 - product(1 - C_i))

Where Q_data is Module A’s quality score and C_i are the individual rule confidence scores from Modules B through E. If Q_data = 0.72 (marginal signal quality), even a perfectly matched failure mode rule with C_i = 0.95 produces a final confidence of only 0.72 x 0.95 = 0.68. The system is honest about uncertainty — it will not claim high confidence from low-quality data.

This design means that RAPID AI degrades gracefully. A noisy sensor does not produce false alarms — it produces lower-confidence diagnoses that explicitly flag the data quality concern.

Data Validation Rules at the Database Level

Section titled “Data Validation Rules at the Database Level”

Beyond Module A’s runtime checks, the database schema enforces structural integrity:

Temporal constraints:

  • analysis_request.end_ts >= analysis_request.start_ts
  • runtime_interval.end_ts > runtime_interval.start_ts
  • failure_event.failure_end_ts >= failure_event.failure_start_ts
  • sensor_evidence_summary.window_end_ts > sensor_evidence_summary.window_start_ts

Referential constraints:

  • Every metric_result row must have exactly one parent — either asset_result_id or area_result_id, never both, never neither.
  • Every dependency_edge must reference source and target nodes within the same graph_id.
  • Every metric_result.metric_code must exist in metric_catalog.

Quality tracking:

  • The quality_code column on telemetry_events (valid/suspect/missing) enables downstream queries to filter or weight readings by data quality.
  • The quality_flag_result table stores data sufficiency and confidence outputs separately from metric results, making it possible to audit why a particular analysis produced low-confidence results.

These constraints ensure that even if application-level validation fails, the database will reject structurally invalid data. Defense in depth — the same principle that makes RAPID AI’s diagnostic pipeline reliable also governs its data layer.


Next: Chapter 16 — Roadmap


StandardRelevance to This Chapter
ISO 14224 — Reliability and maintenance dataThe normalized schema (8 tables x 100 rows) and reliability schema (18 tables) implement ISO 14224’s standardized data structure for equipment taxonomy, failure modes, maintenance events, and reliability metrics.
ISO 13374 — Condition monitoring and diagnostics of machinesThe data architecture supports ISO 13374’s processing chain by providing structured storage for each processing level’s inputs and outputs, from raw sensor data through diagnostic results to advisory outputs.
MIMOSA OSA-CBM — Open System Architecture for CBMThe schema_relation_map table implements OSA-CBM’s data interoperability requirements by maintaining traceable links between asset hierarchy, failure modes, detection rules, and maintenance actions.
VersionDateAuthorChanges
2.1.02026-03-17Rick DAdded standards alignment, living doc metadata, changelog
2.0.02026-03-17Rick DEnriched with production codebase content
1.0.02026-03-17Rick DInitial chapter creation