Data Architecture
Chapter 15 — Data Architecture
Section titled “Chapter 15 — Data Architecture”RAPID AI’s data architecture serves a system that must store raw sensor signals, evaluate hundreds of engineering rules against them, track stability over time, and produce explainable diagnostic outputs. This chapter documents how the data is organized, seeded, queried, and governed.
The Data Model
Section titled “The Data Model”RAPID AI operates on three interconnected schema stacks. Each was designed for a different analytical concern, but together they form the complete data foundation for the platform.
Normalized Schema: 8 Tables x 100 Rows
Section titled “Normalized Schema: 8 Tables x 100 Rows”The normalized schema is the operational backbone. It contains the Integrated Master Schema (IMS) — the domain knowledge that RAPID AI uses to evaluate equipment health. Eight tables, each seeded with exactly 100 rows, provide the core reference data.
| Table | Purpose | Key Columns |
|---|---|---|
asset_master | Equipment templates — one row per asset configuration | asset_template_id, asset_type, subsystem, component, criticality |
functional_failures | What can go wrong at a functional level | functional_failure_id, asset_type, function_description, failure_description |
failure_modes | Specific failure mechanisms | failure_mode_id, asset_type, failure_mode, failure_effect |
sensor_evidence_rules | What sensor patterns indicate which failures | sensor_rule_id, asset_type, sensor_type, evidence_description, confidence_rule |
rcm_rules | Reliability-centered maintenance decision logic | rcm_rule_id, asset_type, failure_mode, recommended_strategy |
maintenance_tasks | Specific maintenance actions and procedures | maintenance_task_id, asset_type, task_description, skill_required |
dashboard_output_mappings | How diagnostic results map to dashboard KPIs | dashboard_output_id, asset_type, kpi_widget, chart_type |
schema_relation_map | Foreign-key bridge connecting all seven tables above | ims_id, references to all other table IDs |
The schema_relation_map table is the linchpin. Each of its 100 rows (IMS001 through IMS100) connects one asset template to its corresponding functional failure, failure mode, sensor rule, RCM rule, maintenance task, and dashboard mapping. This is how the system answers the chain: “for this equipment type, what can fail, how would we detect it, and what should we do?”
A ninth table, table_dictionary, stores metadata about the eight tables themselves — their names, column counts, and descriptions. It contains 8 rows, one per table.
Reliability Schema: 18 Tables
Section titled “Reliability Schema: 18 Tables”The reliability schema, defined in Module D (Prognostics), extends the data model into statistical and temporal analysis territory. It is organized into four layers:
Master / Reference (7 tables):
area_master— logical plant sections (areas, units)asset_master— asset registry for reliability analysisasset_tag— normalized tag lists per assetdependency_graph— graph definitions for connected-equipment analysisdependency_node— one node per asset in each graphdependency_edge— directed, weighted dependency edges between nodesmetric_catalog— catalog of supported reliability metrics (27 metrics across 7 groups: reliability, maintainability, availability, Weibull, growth, dependency risk, data quality)
Control (2 tables):
analysis_request— submitted reliability analysis runsanalysis_request_target— targets included in each request (asset, area, or portfolio scope)
Event / History (5 tables):
failure_event— failures, degradations, trips, shutdowns, restorationsruntime_interval— operating and downtime exposure intervals (basis for MTBF)maintenance_event— work history, optionally linked to triggering failure eventssensor_evidence_summary— condition evidence summaries over time windowssensor_inferred_failure_mode— inferred failure modes from sensor evidence
Result (4 tables):
reliability_result_asset— asset-level result envelopesreliability_result_area— area-level result envelopesmetric_result— normalized metric values (one row per result-metric pair)quality_flag_result— data sufficiency and confidence flags
The schema follows 3NF normalization principles. Result metrics are stored in a normalized metric_result table rather than as fixed columns, making the model extensible for future metrics — additional Weibull outputs, maintenance burden variants, or customer-specific KPIs can be added without schema migration.
Relationship Between Normalized and Reliability Schemas
Section titled “Relationship Between Normalized and Reliability Schemas”The normalized schema answers what can fail and what to do about it — it is the knowledge layer. The reliability schema answers when and how likely — it is the analytical layer.
They connect through two shared concepts:
-
Asset identity: Both schemas have
asset_mastertables, though with different column sets. The normalized schema’sasset_masterstores equipment templates (19 asset types), while the reliability schema’sasset_masterstores specific installed assets with area references and parent hierarchies. -
Failure modes: The normalized schema’s
failure_modestable maps failure mechanisms to equipment types and sensor evidence. The reliability schema’sfailure_eventtable records actual failure occurrences with timestamps. Thesensor_inferred_failure_modetable bridges the two — it references sensor evidence summaries and maps them to failure mode candidates.
Currently, these stacks use incompatible primary key strategies (string IDs like FM0001 in the normalized schema, varchar(64) IDs like AST-PUMP-001 in the reliability schema). A bridge table or unified PK convention is needed before they can be joined in production queries.
Entity-Relationship Overview
Section titled “Entity-Relationship Overview”The full production schema (documented in the Database Schema Pack) defines 34 tables organized into seven entity groups:
- Tenant and user —
tenants,users(multi-tenant boundary) - Asset and dependency —
plants,asset_types,assets,asset_dependencies,sensors - Telemetry and observation —
telemetry_events,waveform_files,manual_observations,maintenance_events,feature_snapshots,anomaly_events - Diagnostic and reasoning —
diagnostic_runs,diagnostic_hypotheses,evidence_items - Rule and knowledge —
engineering_rules,rule_execution_logs,failure_modes,imperfections,imperfection_findings,contradictions,contradiction_findings - Reliability and planning —
reliability_assessments,root_cause_chains,design_recommendations,maintenance_plans,maintenance_plan_tasks,spare_recommendations,manpower_recommendations - Governance —
copilot_sessions,copilot_messages,rule_approvals,knowledge_audit_logs
The diagnostic run is the central transaction. A single diagnostic_runs row spawns hypotheses, evidence items, imperfection findings, contradiction findings, root cause chains, reliability assessments, and maintenance plans. This star structure makes it possible to trace any recommendation back through the full reasoning chain to the original sensor data.
The IMS as Data
Section titled “The IMS as Data”100 Rows x 34 Columns
Section titled “100 Rows x 34 Columns”The Integrated Master Schema (IMS) is the flattened, denormalized view of the eight normalized tables. Each of its 100 rows represents a complete diagnostic pathway — from an equipment template through its failure mode, sensor evidence, RCM decision, and maintenance action, all the way to its dashboard output mapping.
The 34 columns break into logical groups:
| Group | Columns | Purpose |
|---|---|---|
| Asset identity | asset_template_id, asset_type, subsystem, component, criticality | What equipment are we talking about? |
| Functional context | function_description, failure_description | What function does it serve, and what does failure look like? |
| Failure mechanism | failure_mode, failure_effect, severity_code | How specifically does it fail? |
| Sensor evidence | sensor_type, evidence_description, confidence_rule, detection_method | What sensor readings indicate this failure? |
| RCM decision | recommended_strategy, task_interval, decision_basis | What maintenance strategy does RCM prescribe? |
| Maintenance action | task_description, skill_required, estimated_duration | What specific work needs to be done? |
| Dashboard output | kpi_widget, chart_type, alert_threshold, display_priority | How does this appear on screen? |
| Cross-references | ims_id, all foreign key IDs | How do the normalized tables link? |
Loading the IMS: The ReferenceLoader Pattern
Section titled “Loading the IMS: The ReferenceLoader Pattern”The IMS is loaded once at application startup and cached for the lifetime of the process. The ReferenceLoader pattern works as follows:
- On first access: Read all normalized CSV files (or query the database), join them via
schema_relation_map, and build an in-memory lookup structure keyed byasset_typeandfailure_mode_id. - Lazy caching: The loaded data is stored as an immutable dictionary. Subsequent lookups hit the cache, not the database.
- Reload on demand: The
/admin/schema/reloadendpoint invalidates the cache and forces a fresh load — useful after adding new IMS rows.
This pattern keeps diagnostic rule evaluation fast (sub-millisecond lookups) while avoiding the overhead of per-request database queries for reference data that changes infrequently.
Extending the IMS
Section titled “Extending the IMS”The IMS is designed to grow by adding rows, not code. To add support for a new failure mode:
- Add one row to
asset_master(if the equipment type is new) - Add one row to
functional_failures(what function fails) - Add one row to
failure_modes(the specific mechanism) - Add one row to
sensor_evidence_rules(how to detect it) - Add one row to
rcm_rules(what maintenance strategy to apply) - Add one row to
maintenance_tasks(what work to perform) - Add one row to
dashboard_output_mappings(how to display it) - Add one row to
schema_relation_map(link them all together)
No Python code changes. No deployment. The next /admin/schema/reload call picks up the new row, and the diagnostic engine can immediately evaluate it. This is the meaning of “rules are data, not code.”
Seed Data Strategy
Section titled “Seed Data Strategy”320 Failure Modes Across 8 Equipment Types
Section titled “320 Failure Modes Across 8 Equipment Types”The seed data provides a working diagnostic knowledge base from day one. The 320 failure modes are distributed across 8 equipment types:
| Equipment Type | Modes | FM ID Range |
|---|---|---|
| Bearing | 60 | FM0001-FM0020, FM0141-FM0160, FM0281-FM0300 |
| Pump | 60 | FM0021-FM0040, FM0161-FM0180, FM0301-FM0320 |
| Electric Motor | 40 | FM0041-FM0060, FM0181-FM0200 |
| Gearbox | 40 | FM0061-FM0080, FM0201-FM0220 |
| Fan | 30 | FM0081-FM0095, FM0221-FM0235 |
| Compressor | 30 | FM0096-FM0110, FM0236-FM0250 |
| Conveyor | 30 | FM0111-FM0125, FM0251-FM0265 |
| Hydraulic System | 30 | FM0126-FM0140, FM0266-FM0280 |
Bearings and pumps receive the heaviest coverage (60 modes each) because they are the most common rotating equipment in industrial plants and the most frequent targets for vibration-based diagnostics.
SQL Insertion Order (Referential Integrity)
Section titled “SQL Insertion Order (Referential Integrity)”The seed data must be loaded in a specific order to satisfy foreign key constraints. The master runner script (00_run_all_seed_inserts.sql) enforces this:
BEGIN;
1. asset_master -- equipment templates (no dependencies)2. functional_failures -- references asset types3. failure_modes -- references asset types4. sensor_evidence_rules -- references asset types and failure modes5. rcm_rules -- references failure modes and strategies6. maintenance_tasks -- references asset types and strategies7. dashboard_output_mappings -- references asset types and KPIs8. schema_relation_map -- references all seven tables above9. table_dictionary -- metadata about the tables themselves
COMMIT;The entire seed load is wrapped in a single transaction. If any INSERT fails (column-shift defect, duplicate key, FK violation), the entire batch rolls back cleanly. This prevents partial loads that would leave the IMS in an inconsistent state.
The Seed Manifest
Section titled “The Seed Manifest”The seed_insert_manifest.csv tracks what each SQL file contains:
| Table | Source CSV | Output SQL | Row Count |
|---|---|---|---|
| asset_master | asset_master.csv | asset_master_seed_inserts.sql | 100 |
| functional_failures | functional_failures.csv | functional_failures_seed_inserts.sql | 100 |
| failure_modes | failure_modes.csv | failure_modes_seed_inserts.sql | 100 |
| sensor_evidence_rules | sensor_evidence_rules.csv | sensor_evidence_rules_seed_inserts.sql | 100 |
| rcm_rules | rcm_rules.csv | rcm_rules_seed_inserts.sql | 100 |
| maintenance_tasks | maintenance_tasks.csv | maintenance_tasks_seed_inserts.sql | 100 |
| dashboard_output_mappings | dashboard_output_mappings.csv | dashboard_output_mappings_seed_inserts.sql | 100 |
| schema_relation_map | schema_relation_map.csv | schema_relation_map_seed_inserts.sql | 100 |
| table_dictionary | table_dictionary.csv | table_dictionary_seed_inserts.sql | 8 |
Note the discrepancy: the normalized schema seeds 100 failure modes, while the full master library provides 320. The additional 220 modes (FM0101-FM0320) exist in failure_modes_full_320_seed.sql but do not yet have corresponding sensor evidence rules, RCM rules, maintenance tasks, or dashboard mappings. Extending the normalized schema to cover all 320 modes is a documented gap.
pgvector for Semantic Search
Section titled “pgvector for Semantic Search”Why Vector Similarity
Section titled “Why Vector Similarity”Engineers do not query diagnostic systems with SQL. They say things like “my pump is vibrating and running hot” or “bearing noise getting worse after last overhaul.” These natural language descriptions need to be matched against the IMS’s 100 sensor evidence descriptions and 320 failure mode names.
Traditional keyword search fails here because the vocabulary is inconsistent — an engineer might say “shaking” when the IMS says “elevated RMS velocity,” or “overheating” when the evidence rule says “bearing temperature rise above baseline.”
Vector similarity search solves this by converting both the query and the IMS descriptions into high-dimensional embeddings, then finding the closest matches by cosine distance. The query does not need to use the exact words in the evidence description — it needs to be semantically close.
Embedding Strategy
Section titled “Embedding Strategy”The embedding pipeline works on rule and evidence descriptions:
- Source text: Each
sensor_evidence_rules.evidence_descriptionandfailure_modes.failure_modevalue is concatenated with itsasset_typeto form a composite text. - Embedding model: Text is converted to vectors using an embedding model (e.g., OpenAI
text-embedding-3-smallat 1536 dimensions, or a local model for air-gapped deployments). - Storage: Vectors are stored in a
pgvectorcolumn alongside the source row, enabling similarity queries without a separate vector database. - Indexing: An IVFFlat or HNSW index on the vector column enables sub-second similarity search across the full IMS.
Embeddings are generated once when seed data is loaded and regenerated on /admin/schema/reload. Since the IMS changes infrequently (adding rows, not updating existing ones), embedding freshness is not a concern.
Query Pattern
Section titled “Query Pattern”A typical semantic search flow:
User input: "my pump is vibrating and running hot" | vEmbed the query text -> [0.012, -0.034, 0.089, ...] | vSELECT failure_mode_id, failure_mode, asset_type, 1 - (embedding <=> query_embedding) AS similarityFROM failure_modesWHERE asset_type = 'Pump'ORDER BY embedding <=> query_embeddingLIMIT 5; | vResults: 1. FM0023 "Bearing overheating" (similarity: 0.87) 2. FM0025 "Seal failure with temperature rise" (similarity: 0.82) 3. FM0021 "Cavitation" (similarity: 0.71)The results feed into the diagnostic engine as candidate failure modes, bypassing the need for the engineer to know the exact FM ID or navigate the IMS manually. This is the foundation for the Engineering Copilot — a natural language interface to RAPID AI’s rule-based reasoning.
Time-Series Data
Section titled “Time-Series Data”Sensor Readings Table Design
Section titled “Sensor Readings Table Design”The telemetry_events table captures all sensor data flowing into RAPID AI:
| Column | Type | Purpose |
|---|---|---|
telemetry_id | UUID | Primary key |
tenant_id | UUID | Multi-tenant isolation |
asset_id | UUID | Which equipment |
sensor_id | UUID | Which sensor (nullable for manual readings) |
event_time | timestamptz | When the reading was taken (indexed) |
metric_code | text | What was measured (rms, temp, rpm, pressure) |
metric_value | numeric | The measurement value |
unit | text | Engineering unit (mm/s, degrees C, RPM) |
quality_code | text | Data quality flag (valid, suspect, missing) |
ingestion_source | text | How it arrived (stream, file, manual) |
This design uses a narrow-table pattern: each reading is one row with a metric_code discriminator, rather than one wide row with a column per metric. This accommodates assets with different sensor configurations without schema changes — a pump with 3 sensors and a gearbox with 7 sensors both write to the same table.
Partitioning Strategy
Section titled “Partitioning Strategy”At production scale, telemetry_events will be the largest table by orders of magnitude. Two partitioning strategies apply:
By time (primary): Monthly range partitions on event_time. This aligns with the most common query pattern — “show me the last 30 days of readings for this asset” — and enables efficient partition pruning. Old partitions can be detached and moved to cold storage without affecting current queries.
By asset (secondary): Within each time partition, a hash partition on asset_id distributes data evenly across storage. This prevents hot spots when a high-frequency sensor floods the system with readings.
Retention Policy
Section titled “Retention Policy”| Tier | Retention | Granularity | Storage |
|---|---|---|---|
| Hot | 90 days | Raw readings (every sample) | Primary PostgreSQL |
| Warm | 1 year | Hourly aggregates (min, max, avg, stddev) | Partitioned PostgreSQL |
| Cold | 5+ years | Daily aggregates | Object storage (Parquet files) |
Raw waveform data (time-series arrays from accelerometers) is never stored in the relational database. It goes to object storage via the waveform_files table, which stores only the metadata (storage URI, sampling rate, duration) alongside a reference to the asset and sensor.
Data Quality
Section titled “Data Quality”Module A as the Data Quality Gate
Section titled “Module A as the Data Quality Gate”Module A (Signal Validation) is the first module in the pipeline, and its primary purpose is to prevent bad data from entering the system. It applies 19 GUARD rules (DG001-DG019) that check for:
- NaN values in sensor readings
- Flatline signals (sensor stuck at one value)
- Signal clipping (amplitude exceeding sensor range)
- Sampling rate violations
- Unit mismatches
- Excessive noise floors
Each GUARD rule produces either a hard block (signal rejected, downstream modules receive nothing) or a soft penalty (signal accepted with a reduced quality score). The quality score is a multiplicative product of all triggered penalties:
quality_score = product(penalty_i for each triggered rule)A signal with no issues scores 1.0. A signal with two soft penalties of 0.8 and 0.9 scores 0.72. This score propagates downstream through every subsequent module’s confidence calculation.
How Bad Data Degrades Confidence Downstream
Section titled “How Bad Data Degrades Confidence Downstream”Confidence propagation is the mechanism by which Module A’s quality assessment affects every downstream result. The formula:
Canonical reference: See Chapter 6 for the authoritative confidence propagation formula.
C_final = Q_data x (1 - product(1 - C_i))Where Q_data is Module A’s quality score and C_i are the individual rule confidence scores from Modules B through E. If Q_data = 0.72 (marginal signal quality), even a perfectly matched failure mode rule with C_i = 0.95 produces a final confidence of only 0.72 x 0.95 = 0.68. The system is honest about uncertainty — it will not claim high confidence from low-quality data.
This design means that RAPID AI degrades gracefully. A noisy sensor does not produce false alarms — it produces lower-confidence diagnoses that explicitly flag the data quality concern.
Data Validation Rules at the Database Level
Section titled “Data Validation Rules at the Database Level”Beyond Module A’s runtime checks, the database schema enforces structural integrity:
Temporal constraints:
analysis_request.end_ts >= analysis_request.start_tsruntime_interval.end_ts > runtime_interval.start_tsfailure_event.failure_end_ts >= failure_event.failure_start_tssensor_evidence_summary.window_end_ts > sensor_evidence_summary.window_start_ts
Referential constraints:
- Every
metric_resultrow must have exactly one parent — eitherasset_result_idorarea_result_id, never both, never neither. - Every
dependency_edgemust reference source and target nodes within the samegraph_id. - Every
metric_result.metric_codemust exist inmetric_catalog.
Quality tracking:
- The
quality_codecolumn ontelemetry_events(valid/suspect/missing) enables downstream queries to filter or weight readings by data quality. - The
quality_flag_resulttable stores data sufficiency and confidence outputs separately from metric results, making it possible to audit why a particular analysis produced low-confidence results.
These constraints ensure that even if application-level validation fails, the database will reject structurally invalid data. Defense in depth — the same principle that makes RAPID AI’s diagnostic pipeline reliable also governs its data layer.
Next: Chapter 16 — Roadmap
Standards Alignment
Section titled “Standards Alignment”| Standard | Relevance to This Chapter |
|---|---|
| ISO 14224 — Reliability and maintenance data | The normalized schema (8 tables x 100 rows) and reliability schema (18 tables) implement ISO 14224’s standardized data structure for equipment taxonomy, failure modes, maintenance events, and reliability metrics. |
| ISO 13374 — Condition monitoring and diagnostics of machines | The data architecture supports ISO 13374’s processing chain by providing structured storage for each processing level’s inputs and outputs, from raw sensor data through diagnostic results to advisory outputs. |
| MIMOSA OSA-CBM — Open System Architecture for CBM | The schema_relation_map table implements OSA-CBM’s data interoperability requirements by maintaining traceable links between asset hierarchy, failure modes, detection rules, and maintenance actions. |
Changelog
Section titled “Changelog”| Version | Date | Author | Changes |
|---|---|---|---|
| 2.1.0 | 2026-03-17 | Rick D | Added standards alignment, living doc metadata, changelog |
| 2.0.0 | 2026-03-17 | Rick D | Enriched with production codebase content |
| 1.0.0 | 2026-03-17 | Rick D | Initial chapter creation |