Machine Dependency and System Topology
Chapter 27: Machine Dependency & System Topology
27.1 The Machine as a System
No machine operates in isolation. Every rotating asset in an industrial plant exists within a web of mechanical, thermal, electrical, and process connections. A centrifugal pump does not simply “pump.” It is driven by a motor, connected through a coupling, mounted on a shared baseplate, fed by a suction piping system, discharging into a process header, lubricated by an oil system, cooled by a seal flush plan, and controlled by instrumentation. When we diagnose that pump, we are really diagnosing a system.
The concept of a machine train captures this reality. A typical machine train in a refinery might look like:
    Electric Motor → Flexible Coupling → Centrifugal Pump → Mechanical Seal → Process Piping
          ↓                 ↓                   ↓                  ↓
     Lubrication        Alignment          Bearing Oil         Seal Flush
       System             State              System               Plan

Each element in this train has its own failure modes, its own degradation curves, and its own diagnostic signatures. But critically, each element’s health affects every other element. A motor with a developing rotor bar defect will produce torque pulsations. Those pulsations transmit through the coupling to the pump, where they manifest as vibration that can be misdiagnosed as a pump problem. A pump operating in cavitation generates broadband vibration that transmits backward through the coupling to the motor bearings, accelerating their wear.
Failure propagation is the central concern. Consider this real-world cascade:
1. A cooling fan bearing begins to degrade (Stage 1: no detectable symptoms on the pump itself).
2. The fan loses efficiency. Lube oil temperature rises by 8 degrees Celsius.
3. Oil viscosity drops. The lubricant film in the pump bearings thins.
4. Pump bearing temperatures rise. Wear accelerates from normal to aggressive.
5. Bearing clearance increases. Shaft orbit grows. The mechanical seal begins to see excessive radial movement.
6. Seal face wear accelerates. Leakage begins.
7. Process fluid leaks. Environmental alarm. Emergency shutdown.
The root cause was a $40 fan bearing. The consequence was a $500,000 shutdown. The diagnostic challenge is that at Step 5, every sensor on the pump is screaming “pump problem,” but the pump was a victim, not a perpetrator.
This is why system topology matters. Without understanding the dependency chain, diagnostics become a game of treating symptoms while the root cause festers upstream.
27.2 Dependency Mapping
To reason about machine systems, we must first map their dependencies. There are three fundamental types.
Functional Dependencies
Functional dependencies answer: “What must be running for this machine to operate?”
A centrifugal pump requires its driver motor to be running. The motor requires its variable frequency drive (if equipped) to be commanding speed. The VFD requires the control system to be issuing a run command. The pump requires suction pressure from an upstream vessel. It requires discharge backpressure to be within its curve. It requires seal flush flow. It requires lube oil supply.
Functional dependencies are typically hierarchical:
    Process Demand
    └── Pump P-101A (or standby P-101B)
        ├── Motor M-101A
        │   ├── Electrical Supply (MCC Bus 3)
        │   └── VFD-101A
        ├── Coupling C-101A
        ├── Lube Oil System LO-101
        │   ├── Oil Pump LP-101
        │   └── Oil Cooler OC-101
        └── Seal System SS-101
            ├── Seal Flush Pump SP-101
            └── Seal Flush Cooler SC-101

If the oil cooler OC-101 fails, both P-101A and its motor M-101A are at risk because they share the lubrication system. This is a functional dependency that crosses machine boundaries.
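As a minimal sketch of how such a hierarchy can be queried, the tree above can be stored as "depends on" edges and walked in reverse to find everything exposed to a single failure. The dictionary layout and the `affected_by` helper are illustrative assumptions, not RAPID AI's actual data model. The `M-101A` to `LO-101` edge is added because the text states that the motor shares the lubrication system.

```python
from collections import deque

# Edges read "asset -> what it depends on", transcribed from the tree above.
# Assumption: M-101A also depends on LO-101 (the text says they share lube oil).
DEPENDS_ON = {
    "P-101A": ["M-101A", "C-101A", "LO-101", "SS-101"],
    "M-101A": ["MCC Bus 3", "VFD-101A", "LO-101"],
    "LO-101": ["LP-101", "OC-101"],
    "SS-101": ["SP-101", "SC-101"],
}


def affected_by(failed, depends_on):
    """Return every asset that directly or indirectly depends on `failed`."""
    # Invert the edges so we can walk from a failed item to its dependents.
    dependents = {}
    for asset, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(asset)

    seen = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for parent in dependents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(affected_by("OC-101", DEPENDS_ON)))
# ['LO-101', 'M-101A', 'P-101A']
```

The breadth-first walk keeps a `seen` set, so a topology that accidentally contains a loop still terminates.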
Structural Dependencies
Structural dependencies arise from physical connections:
- Shared foundations: Two machines on the same baseplate or concrete foundation. Foundation degradation (cracking, grout deterioration, loose anchor bolts) affects both machines simultaneously.
- Shared piping: Suction and discharge piping transmits vibration between machines. A reciprocating compressor on a shared header can excite pulsation in connected vessels and piping.
- Shared utilities: Common lube oil header, common cooling water supply, common instrument air.
- Proximity effects: Thermal radiation from a hot machine can affect an adjacent machine’s bearing temperatures. Vibration from a large machine can transmit through the floor to sensitive nearby equipment.
Operational Dependencies
Operational dependencies are imposed by the process:
- Throughput coupling: If pump P-101A feeds reactor R-201, the reactor’s demand dictates the pump’s operating point. Running the pump off its best efficiency point (BEP) because of process constraints accelerates wear.
- Switchover dependencies: Standby pump P-101B must be ready when P-101A trips. If P-101B has a latent defect (discovered only on start), the “redundancy” is an illusion.
- Start/stop sequences: Some machines must start in a specific order. Starting a compressor before establishing seal oil flow destroys the seals in seconds.
How RAPID AI Maps Dependencies
In RAPID AI, dependency mapping is configured in Module 0, the system configuration layer. Module 0 defines:
- Machine train topology: which assets are mechanically connected
- Shared utility mappings: which assets share lube oil, cooling, electrical supply
- Process train linkages: upstream/downstream process relationships
- Redundancy configurations: A/B standby pairs, k-out-of-n arrangements
- Criticality inheritance: a non-critical lube oil pump becomes critical when it serves a critical compressor
This topology is not decorative metadata. It is actively used by Modules C and G to propagate diagnostic evidence across machine boundaries.
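Criticality inheritance from the list above can be sketched as a small fixed-point propagation: a supporting asset takes the highest criticality of anything it serves. The level names, the `serves` mapping, and the asset tags below are hypothetical and are not taken from Module 0's real configuration schema.

```python
# Ordering of criticality levels, lowest to highest (assumed naming).
RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def effective_criticality(own, serves):
    """Raise each asset to the highest criticality of the assets it serves.

    own:    asset -> its standalone criticality level
    serves: asset -> list of assets that rely on it
    """
    effective = dict(own)
    changed = True
    # Repeat until no level changes, so inheritance passes through chains
    # (for example cooler -> lube oil pump -> compressor).
    while changed:
        changed = False
        for asset, served_assets in serves.items():
            for served in served_assets:
                if RANK[effective[served]] > RANK[effective[asset]]:
                    effective[asset] = effective[served]
                    changed = True
    return effective


# Hypothetical tags: compressor K-201, its lube oil pump LP-201, an unrelated pump P-305.
own = {"K-201": "critical", "LP-201": "low", "P-305": "medium"}
serves = {"LP-201": ["K-201"]}
print(effective_criticality(own, serves))
# {'K-201': 'critical', 'LP-201': 'critical', 'P-305': 'medium'}
```

Levels can only rise and are bounded by the top rank, so the loop always terminates.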
27.3 Cascading Failure Analysis
Cascading failures follow predictable patterns. Understanding these patterns turns reactive firefighting into proactive interception.
Single-Machine Cascades
Within a single machine, failure cascades follow thermodynamic and mechanical logic:
Misalignment cascade:
    Angular misalignment (0.003" offset)
      → Cyclic axial loading on coupling (2× running speed)
      → Coupling element fatigue (grid wear, disc cracking)
      → Increased radial bearing load (coupling-end bearing)
      → Elevated bearing temperature (+15°C)
      → Lubricant viscosity reduction
      → Reduced film thickness
      → Metal-to-metal contact
      → Bearing failure (4-8 weeks from initial misalignment)

Imbalance cascade:
    Rotor imbalance (fouling deposit on impeller)
      → Synchronous vibration increase (1× running speed)
      → Dynamic bearing load increase
      → Shaft deflection at seal
      → Seal face distortion
      → Seal leakage
      → Process loss / environmental event

Cross-Machine Cascades
Cross-machine cascades are more insidious because the symptoms appear on the wrong machine:
Pump cavitation → piping → adjacent equipment:
    Pump P-101A cavitates (low NPSH available)
      → Broadband vibration (implosion of vapor bubbles)
      → Pressure pulsation in discharge piping
      → Piping vibration excites branch connections
      → Small-bore connection fatigue (vent valve, drain valve)
      → Piping leak at adjacent heat exchanger connection

Motor electrical fault → coupling → pump:
    Motor M-101A develops stator winding fault
      → Unbalanced magnetic pull (UMP)
      → Radial force on motor shaft at 2× line frequency
      → Transmitted through coupling to pump shaft
      → Pump bearing sees unexpected 100/120 Hz vibration
      → Misdiagnosed as pump misalignment or looseness

The IAR Framework Applied to Cascading Failures
When RAPID AI encounters evidence of multiple simultaneous faults, the IAR framework (Initiator-Accelerator-Result) helps classify each fault:
- Initiator: The root cause that started the cascade. Often a single-point failure with a clear time origin.
- Accelerator: Secondary conditions that speed up the cascade. Not the root cause, but not innocent either.
- Result: The final, visible failure. Usually what triggers the alarm or trip. Often not the root cause.
Example: Pump P-101A trips on high vibration.
| Evidence | IAR Classification | Explanation |
|---|---|---|
| Motor bearing defect (inner race) | Initiator | Started 6 weeks ago, progressed through Stages 1-3 |
| Coupling misalignment (angular) | Accelerator | Pre-existing condition, worsened by motor bearing clearance growth |
| Pump seal failure | Result | Consequence of excessive shaft movement from combined defects |
RAPID AI’s Module C fusion examines all blocks across all machines in a train. If Block F2 (bearing defect) fires on the motor at the same time Block H3 (seal distress) fires on the pump, and the topology map says they are in the same train, Module C evaluates the temporal sequence and the mechanical linkage to determine the cascade direction.
27.4 System Reliability Modeling
Individual machine health matters, but the plant cares about system availability. RAPID AI computes system-level reliability from component-level health scores.
Series Systems
A machine train is a series system: every component must function for the train to operate.

    R_train = R_motor × R_coupling × R_pump × R_seal

If each component has 98% reliability over a one-year period:

    R_train = 0.98 × 0.98 × 0.98 × 0.98 = 0.922 (92.2%)

A train of four components, each 98% reliable, gives only 92.2% system reliability. This is why machine trains require more attention than their individual component reliabilities suggest.
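The series arithmetic above can be checked with a few lines of Python. This is only a sketch of the textbook product rule; the function name is illustrative, and it assumes component failures are independent.

```python
import math


def series_reliability(reliabilities):
    """Reliability of a series system: the product of component reliabilities.

    Assumes independent component failures.
    """
    return math.prod(reliabilities)


# Motor, coupling, pump, seal: each 98% reliable over the period.
r_train = series_reliability([0.98, 0.98, 0.98, 0.98])
print(f"{r_train:.3f}")  # 0.922
```

Adding a fifth 98% component to the chain drops the result to about 0.904, which shows how quickly long trains erode system reliability.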
Parallel Systems (Redundancy)
Standby configurations improve system reliability:

    R_system = 1 - (1 - R_A) × (1 - R_B)

Two pump trains in A/B standby, each 92.2% reliable:

    R_system = 1 - (1 - 0.922) × (1 - 0.922) = 1 - 0.078 × 0.078 = 0.994 (99.4%)

But this assumes the standby pump is truly ready. In practice, standby equipment suffers from its own failure modes: standstill corrosion, seal dry-out, lubricant degradation, check valve failures, control logic faults. RAPID AI monitors standby equipment health specifically for standby-related degradation modes.
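A short sketch of the parallel formula, generalized to any number of redundant units. It assumes independent failures and a standby unit that always starts on demand, which is exactly the idealization the paragraph above warns about. The function name is illustrative.

```python
def parallel_reliability(reliabilities):
    """Reliability of a parallel (redundant) system.

    The system fails only if every unit fails, so multiply the individual
    failure probabilities and subtract from 1. Assumes independent failures.
    """
    all_fail = 1.0
    for r in reliabilities:
        all_fail *= 1.0 - r
    return 1.0 - all_fail


# Two trains in A/B standby, each 92.2% reliable.
r_system = parallel_reliability([0.922, 0.922])
print(f"{r_system:.3f}")  # 0.994
```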
Common Cause Failures
The great destroyer of redundancy is the common cause failure. If both pumps share a lubrication system, and the lube oil is contaminated, both pumps fail simultaneously. The “redundancy” provides zero benefit.

    R_system_with_CCF = 1 - (1 - R_A) × (1 - R_B) - P_CCF

Common cause failure probability (P_CCF) is typically 1-5% of total failure probability, but it dominates system-level risk because it defeats redundancy.
Common cause failure sources in practice: shared lube oil, shared cooling water, shared electrical supply (same bus), shared control system, shared foundation, shared environmental conditions (dust, temperature, humidity).
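To see why a small common cause term matters, the formula from this subsection can be evaluated numerically. The sketch below assumes that "total failure probability" means one unit's failure probability, and it picks 2% from the quoted 1-5% range. Both choices are illustrative assumptions.

```python
def redundant_pair_with_ccf(r_a, r_b, p_ccf):
    """Reliability of an A/B pair with a common cause failure term subtracted."""
    return 1.0 - (1.0 - r_a) * (1.0 - r_b) - p_ccf


r_unit = 0.922
# Assumption: P_CCF is 2% of a single unit's failure probability.
p_ccf = 0.02 * (1.0 - r_unit)

independent_failure = (1.0 - r_unit) ** 2  # both units fail independently
r_system = redundant_pair_with_ccf(r_unit, r_unit, p_ccf)
ccf_share = p_ccf / (independent_failure + p_ccf)

print(f"R_system  = {r_system:.4f}")   # 0.9924
print(f"CCF share = {ccf_share:.0%}")  # 20%
```

Under these assumptions, a common cause term worth only 2% of a unit's failure probability accounts for about a fifth of all system failures.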
k-out-of-n Systems
Some systems require k out of n units to operate:
- 2 out of 3 cooling fans for an air-cooled heat exchanger
- 3 out of 4 boiler feed pumps for a power plant
- 1 out of 2 compressors for a refrigeration system
The reliability formula for k-out-of-n with identical components:
    R_system = Σ(i=k to n) C(n,i) × R^i × (1-R)^(n-i)

RAPID AI computes system-level SSI (System Severity Index) by combining component SSI values according to the topology:

    SSI_system = f(SSI_components, topology_type, dependency_weights)

For series topology, the system SSI is dominated by the worst component. For parallel topology, the system SSI reflects the combined risk of losing redundancy.
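The k-out-of-n reliability formula given earlier in this subsection translates directly into code. This sketch covers only that binomial sum for identical, independent units. It does not attempt the SSI combination function `f`, whose form is not specified here.

```python
from math import comb


def k_out_of_n_reliability(k, n, r):
    """Probability that at least k of n identical, independent units work."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


# 2 out of 3 cooling fans, each 90% reliable (the 90% figure is illustrative).
print(f"{k_out_of_n_reliability(2, 3, 0.9):.3f}")  # 0.972
```

Two limiting cases make useful sanity checks: `k = n` reduces to the series product `r**n`, and `k = 1` reduces to the parallel formula `1 - (1 - r)**n`.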
27.5 Process Criticality
Not all machine trains are equally important. A plant may have 500 rotating assets, but perhaps 50 of them are truly critical: their failure causes safety incidents, environmental releases, or production losses exceeding $100,000 per hour.
Equipment Criticality Matrix
Criticality assessment combines consequence and probability:
| | Low Consequence | Medium Consequence | High Consequence | Extreme Consequence |
|---|---|---|---|---|
| High Probability | Medium | High | Critical | Critical |
| Medium Probability | Low | Medium | High | Critical |
| Low Probability | Low | Low | Medium | High |
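The matrix above is small enough to encode as a lookup table. The following sketch transcribes it row by row. The lowercase keys and the function name are illustrative choices.

```python
# Column order of the matrix, left to right.
CONSEQUENCES = ("low", "medium", "high", "extreme")

# One row per probability band, transcribed from the matrix above.
CRITICALITY_MATRIX = {
    "high": ("Medium", "High", "Critical", "Critical"),
    "medium": ("Low", "Medium", "High", "Critical"),
    "low": ("Low", "Low", "Medium", "High"),
}


def criticality(probability, consequence):
    """Look up the criticality class for a probability/consequence pair."""
    row = CRITICALITY_MATRIX[probability.lower()]
    return row[CONSEQUENCES.index(consequence.lower())]


print(criticality("Medium", "High"))  # High
print(criticality("Low", "Extreme"))  # High
```

An unknown band raises `KeyError` or `ValueError` rather than silently defaulting, which is usually the safer behavior for a risk classification.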
Consequence categories:
- Safety: Could failure injure or kill someone? (Highest weight)
- Environmental: Could failure cause a release to atmosphere, soil, or water?
- Production: What is the production loss per hour of downtime?
- Repair cost: What does it cost to fix, including secondary damage?
Probability factors:
- Historical failure frequency (MTBF from CMMS data)
- Current condition (from RAPID AI’s SSI score)
- Design margins (how close to limits is the machine operating?)
- Age and obsolescence
Criticality and Maintenance Strategy
Criticality drives the maintenance approach:
| Criticality | Monitoring Strategy | Maintenance Approach |
|---|---|---|
| Critical | Continuous online monitoring + RAPID AI | Predictive (condition-based) |
| High | Periodic monitoring (monthly) + RAPID AI | Predictive + some preventive |
| Medium | Periodic monitoring (quarterly) | Preventive with condition triggers |
| Low | Run to failure or basic inspections | Corrective (fix when broken) |
RAPID AI’s Criticality Factor K
RAPID AI incorporates criticality through the factor K (0 to 1), which scales the urgency of diagnostic outputs:
- K = 1.0: Critical asset. Any fault detection triggers immediate alert. RUL estimates carry tight confidence bounds. Risk index is amplified.
- K = 0.5: Medium criticality. Faults are reported on standard timelines. RUL estimates allow wider planning windows.
- K = 0.1: Low criticality. Only severe faults (Stage 3+) trigger alerts. RUL is estimated but does not trigger automatic work orders.
K appears in the risk index calculation:
    Risk_Index = K × (1 - SSI/100) × Consequence_Factor

This ensures that a Stage 2 bearing defect on a critical boiler feed pump commands far more attention than a Stage 3 bearing defect on a non-critical cooling water pump.
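A minimal sketch of the risk index formula as written above. The SSI inputs and the consequence factor of 1.0 are arbitrary placeholders chosen only to exercise the formula. The chapter does not state how SSI maps to fault stages or what scale `Consequence_Factor` uses, so the 0-100 range check is an assumption read off the `SSI/100` term.

```python
def risk_index(k, ssi, consequence_factor):
    """Risk_Index = K * (1 - SSI/100) * Consequence_Factor, as given in the text."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("K must be between 0 and 1")
    if not 0.0 <= ssi <= 100.0:
        raise ValueError("SSI is assumed to lie on a 0-100 scale")
    return k * (1.0 - ssi / 100.0) * consequence_factor


# Placeholder inputs: a critical asset (K = 1.0) versus a low-criticality one (K = 0.1).
critical_pump = risk_index(k=1.0, ssi=70.0, consequence_factor=1.0)
cooling_pump = risk_index(k=0.1, ssi=50.0, consequence_factor=1.0)

print(f"critical: {critical_pump:.2f}, low criticality: {cooling_pump:.2f}")
# critical: 0.30, low criticality: 0.05
```

Because K multiplies the whole expression, a tenfold difference in K outweighs a moderate difference in the SSI term, which is the behavior the paragraph above describes.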
27.6 Dependency-Aware Diagnostics
The practical implication of all this topology and dependency modeling is that diagnostics must be dependency-aware. When RAPID AI diagnoses a machine, it does not look at that machine in isolation. It evaluates the entire train.
The Diagnostic Sequence
When vibration on pump P-101A increases:
- Evaluate P-101A: Run all diagnostic blocks (F1-F7, H1-H5, etc.) for the pump.
- Evaluate M-101A: Run diagnostic blocks for the motor. Is the motor generating the vibration?
- Evaluate the coupling: Check alignment indicators, coupling-frequency signatures.
- Evaluate the process: Is the pump operating off-BEP? Is NPSH adequate? Is the discharge valve partially closed?
- Evaluate shared systems: Lube oil condition, cooling water temperature, foundation integrity.
- Cross-reference timing: Did the motor fault precede the pump fault? Module G’s temporal analysis identifies which came first.
Example: The Misdiagnosed Pump
A real-world scenario illustrates the value of dependency-aware diagnostics:
Symptoms on Pump P-301A:
- Vibration at 1× RPM increased from 2.5 mm/s to 7.8 mm/s over 3 weeks
- Bearing temperature increased from 65°C to 78°C
- Axial vibration increased
Initial diagnosis (without dependency awareness): Pump imbalance or internal wear. Recommended action: pull pump for overhaul.
RAPID AI diagnosis (with dependency awareness):
- Motor M-301A shows developing 2× line frequency vibration (electrical fault signature)
- Coupling alignment check reveals angular misalignment has increased from 0.002” to 0.006” — consistent with motor bearing wear changing the shaft position
- Motor bearing Block F2 shows Stage 2 inner race defect at BPFI frequency
- The pump itself shows no defect frequencies — its elevated vibration is transmitted from the motor
Correct diagnosis: Motor bearing defect causing shaft position change, inducing misalignment, transmitting vibration to pump. Fix the motor bearing and realign. The pump does not need to be opened.
Savings: Avoided a $45,000 unnecessary pump overhaul. Replaced an $800 motor bearing instead. Downtime: 8 hours instead of 72 hours.
FRETTLSM Across Machine Boundaries
RAPID AI’s FRETTLSM framework (Frequency, Resolution, Energy, Time, Trend, Load, Speed, Modulation) applies across machine boundaries in a train:
- Frequency: Does the suspect frequency on machine A correspond to a known defect frequency of machine B? (e.g., motor BPFI appearing in pump spectrum)
- Energy: Is the energy pattern consistent with transmission through a coupling? (Typically attenuated by 30-50% across a flexible coupling)
- Trend: Do the trend start times correlate between machines? (Simultaneous onset suggests common cause; sequential onset suggests cascade)
- Load/Speed: Does the fault signature change with operating conditions in a way consistent with the suspected source machine?
Module C fusion aggregates evidence across all machines in the train, weights it by the dependency topology, and produces a train-level diagnostic that identifies the initiating fault, the cascade path, and the appropriate corrective action for each element.
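The Trend rule above (simultaneous onset suggests common cause, sequential onset suggests cascade) can be sketched as a simple comparison of trend start times. This is an illustration of that single heuristic, not Module C's fusion logic. The two-day tolerance and the function name are hypothetical.

```python
from datetime import datetime, timedelta


def classify_onset(onset_a, onset_b, tolerance=timedelta(days=2)):
    """Compare trend start times on two machines in the same train.

    Onsets within `tolerance` of each other are treated as simultaneous.
    The default tolerance is an arbitrary illustrative value.
    """
    gap = onset_b - onset_a
    if abs(gap) <= tolerance:
        return "common cause suspected"
    if gap > timedelta(0):
        return "cascade suspected: A -> B"
    return "cascade suspected: B -> A"


motor_onset = datetime(2026, 1, 5)   # machine A: motor trend starts rising
pump_onset = datetime(2026, 2, 16)   # machine B: pump trend starts six weeks later
print(classify_onset(motor_onset, pump_onset))
# cascade suspected: A -> B
```

Timing alone does not prove direction. In practice it would be weighed together with the frequency, energy, and load/speed evidence listed above.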
This is the essence of system thinking applied to machinery diagnostics. The machine is not the unit of analysis. The system is.
Machine-Specific Diagnostic Approaches
Each machine type demands a different diagnostic emphasis. RAPID AI’s IMS captures this through asset-type-specific profiles.
Centrifugal Pumps
Primary Failure Modes: Seal failure, bearing degradation, impeller erosion, cavitation
Key Monitoring Points: DE bearing (H, V, A), NDE bearing, seal housing, discharge pressure
Critical Frequencies: 1X, vane pass (N_vanes × RPM), BPFO/BPFI
Unique Challenges:
- Cavitation produces broadband high-frequency noise — easily confused with bearing damage
- Seal failures are sudden with little vibration warning
- Process changes (flow rate, head) directly affect vibration baseline
- Pain point: Dry running for even 30 seconds can destroy mechanical seals ($5,000-$50,000 replacement)
RAPID AI Profile: pump — weights vibration 35%, temperature 25%, process 25%, electrical 15%
Gearboxes
Primary Failure Modes: Tooth wear, tooth fracture, bearing failure, shaft misalignment
Key Monitoring Points: Input shaft bearings, output shaft bearings, housing (near mesh)
Critical Frequencies: GMF = N_teeth × RPM, hunting tooth frequency, natural frequencies
Unique Challenges:
- Multiple gear stages create complex spectra
- Amplitude modulation distinguishes wear from damage
- Oil analysis is critical secondary indicator
- Pain point: Replacement gearboxes have 16-26 week lead times and cost $100K-$1M+
RAPID AI Profile: gearbox — weights vibration 45%, temperature 20%, oil_analysis 25%, electrical 10%
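One plausible reading of the profile weights is a weighted sum of per-channel scores. The sketch below assumes that interpretation, and it assumes each channel is scored on a common 0-100 scale. The chapter does not state how RAPID AI actually applies the weights. The weights themselves are copied from the pump and gearbox profiles above.

```python
# Weights transcribed from the pump and gearbox profiles above.
PROFILES = {
    "pump": {"vibration": 0.35, "temperature": 0.25, "process": 0.25, "electrical": 0.15},
    "gearbox": {"vibration": 0.45, "temperature": 0.20, "oil_analysis": 0.25, "electrical": 0.10},
}


def weighted_score(profile, channel_scores):
    """Combine per-channel scores using the profile weights (assumed weighted sum)."""
    weights = PROFILES[profile]
    # Guard against a mistyped profile whose weights do not sum to 100%.
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError(f"weights for {profile!r} do not sum to 1")
    return sum(weight * channel_scores[channel] for channel, weight in weights.items())


# Hypothetical channel scores on a 0-100 scale.
pump_scores = {"vibration": 80, "temperature": 60, "process": 90, "electrical": 100}
print(f"{weighted_score('pump', pump_scores):.1f}")  # 80.5
```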
Electric Motors
Primary Failure Modes: Bearing failure, stator winding, rotor bar cracking, eccentricity
Key Monitoring Points: DE bearing, NDE bearing, motor frame, current (MCSA)
Critical Frequencies: 1X, 2× line frequency (2×FL: 120 Hz on a 60 Hz supply, 100 Hz on a 50 Hz supply), rotor bar pass, slot harmonics
Unique Challenges:
- VFD-driven motors have complex spectra (switching harmonics)
- Current analysis (MCSA) can detect rotor faults earlier than vibration
- Winding faults have very short P-F intervals (days to weeks)
- Pain point: Motor rewinds are common, but each rewind reduces insulation life ~20%
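The critical frequencies listed for pumps, gearboxes, and motors above are simple multiples of shaft speed or line frequency. The sketch below converts them to hertz. Because the products such as N_vanes × RPM are in cycles per minute, each is divided by 60. The function names and the example machine (1800 RPM, 6 vanes, 25 teeth, 60 Hz supply) are illustrative.

```python
def running_speed_hz(rpm):
    """1X running speed in Hz."""
    return rpm / 60.0


def vane_pass_hz(n_vanes, rpm):
    """Pump vane pass frequency: N_vanes x RPM, converted to Hz."""
    return n_vanes * rpm / 60.0


def gear_mesh_hz(n_teeth, rpm):
    """Gear mesh frequency: N_teeth x RPM of that gear's shaft, in Hz."""
    return n_teeth * rpm / 60.0


def twice_line_hz(line_hz):
    """2x line frequency: 120 Hz on a 60 Hz supply, 100 Hz on 50 Hz."""
    return 2.0 * line_hz


rpm = 1800
print(running_speed_hz(rpm))   # 30.0
print(vane_pass_hz(6, rpm))    # 180.0
print(gear_mesh_hz(25, rpm))   # 750.0
print(twice_line_hz(60))       # 120.0
```

Blade pass frequency for fans follows the same pattern as vane pass, with the blade count in place of the vane count.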
Fans/Blowers
Primary Failure Modes: Blade erosion, bearing failure, belt drive issues, structural resonance
Key Monitoring Points: All bearings (tri-axial), blade pass, belt frequencies
Critical Frequencies: 1X, blade pass (N_blades × RPM), belt rate
Unique Challenges:
- Blade buildup changes balance dynamically (cement, coal, chemical deposits)
- Structural resonance exposed by VFD speed changes (see ID Fan case study, Ch22)
- High-temperature fans have thermal growth issues
- Pain point: Process-induced erosion means “fixed” unbalance returns within weeks
Compressors (Reciprocating)
Primary Failure Modes: Valve failure, piston ring wear, rod packing leaks, foundation looseness
Key Monitoring Points: Cylinder valves (temperature, vibration), frame vibration, rod drop
Critical Frequencies: 1X, valve impact, rod passing
Unique Challenges:
- Reciprocating forces dominate spectrum — masking other faults
- Valve failures are sudden and can cause secondary damage
- Pressure pulsation analysis required alongside vibration
- Pain point: Compressor valve replacement costs $500-$5,000 per valve, but unplanned failure causes $50,000+ in secondary damage
Standards Alignment
| Standard | Relevance to This Chapter |
|---|---|
| ISO 14224 — Reliability and maintenance data | The machine train dependency model and cascade failure analysis use ISO 14224’s equipment taxonomy and failure classification to structure system-level reliability assessment across interconnected assets. |
| ISO 17359 — General guidelines for condition monitoring | The system topology approach extends ISO 17359’s single-asset monitoring guidelines to multi-asset machine trains, enabling root cause identification across equipment boundaries. |
| ISO 55000/55001 — Asset management | The dependency graph analysis supports ISO 55000’s system-level asset management by quantifying how individual equipment failures propagate through interconnected systems, enabling risk-informed resource allocation. |
Changelog
| Version | Date | Author | Changes |
|---|---|---|---|
| 2.1.0 | 2026-03-17 | Rick D | Added standards alignment, living doc metadata, changelog |
| 2.0.0 | 2026-03-17 | Rick D | Enriched with production codebase content |
| 1.0.0 | 2026-03-17 | Rick D | Initial chapter creation |