NASA Logo, National Aeronautics and Space Administration

Results

We have experimentally evaluated our approach on a number of ADAPT fault scenarios. To enable RTOS embedding, the ADAPT BN was compiled off-line into an arithmetic circuit, which was then evaluated on-line. What distinguishes this work from previous Bayesian network-based research on EPS diagnosis is how we reduce a complex diagnostic search space to an arithmetic circuit, supported by a small-footprint arithmetic circuit evaluator. Compiling an ADAPT BN, which contains over 400 nodes representing over 100 EPS components, into an arithmetic circuit and evaluating it with the arithmetic circuit evaluator (ACE) gives accurate diagnostic results as well as inference times that are less than one millisecond for all our fault scenarios. This is a successful demonstration of our approach on a real-world problem of great importance to NASA.

We now turn to experiments using ADAPT and different inference algorithms. The experiments are divided into two sets: hand-crafted, real-world scenarios from ADAPT, and simulated scenarios that were automatically generated from an ADAPT BN. In both cases, we executed probabilistic queries over the health variables to find out which components or sensors, if any, were in non-healthy states.

The ACE system was used to (i) compile an ADAPT BN into an arithmetic circuit and (ii) evaluate that arithmetic circuit. The timing measurements reported here were made on a PC with an Intel 4 1.83 GHz processor, 1 GB RAM, and Windows XP.
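
To illustrate what evaluating an arithmetic circuit involves, here is a minimal sketch for a toy two-node network (one health node, one sensor node). It is not the ADAPT BN and does not use ACE's actual interface; the node names, state names, probabilities, and function names are all assumptions made for illustration. The circuit is the network polynomial written as nested sums and products, and evidence is committed by setting indicator values to 0 or 1.

```python
# Toy network: Health -> Sensor. All numbers are invented for illustration.
PRIOR = {"healthy": 0.95, "stuckOpen": 0.05}        # P(Health)
CPT = {                                               # P(Sensor | Health)
    "healthy":   {"closed": 0.99, "open": 0.01},
    "stuckOpen": {"closed": 0.02, "open": 0.98},
}


def indicators(domain, observed=None):
    """Evidence indicators: 1.0 for states consistent with the evidence, else 0.0."""
    return {v: 1.0 if observed is None or v == observed else 0.0 for v in domain}


def evaluate_circuit(lam_h, lam_s):
    """Evaluate f = sum_h lam_h * P(h) * (sum_s lam_s * P(s | h))."""
    total = 0.0
    for h, p_h in PRIOR.items():
        inner = sum(lam_s[s] * p for s, p in CPT[h].items())  # addition node
        total += lam_h[h] * p_h * inner                        # multiplication node
    return total


def posterior_health(sensor_reading):
    """P(Health | Sensor = sensor_reading), obtained from circuit evaluations."""
    lam_s = indicators(CPT["healthy"], sensor_reading)
    p_evidence = evaluate_circuit(indicators(PRIOR), lam_s)
    return {
        h: evaluate_circuit(indicators(PRIOR, h), lam_s) / p_evidence
        for h in PRIOR
    }
```

In this sketch the off-line step corresponds to fixing the structure of `evaluate_circuit`, and the on-line step is only setting indicators and doing additions and multiplications. A production evaluator would typically obtain all marginals from one upward and one downward pass rather than re-evaluating once per state, as is done here for brevity.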

Experiments using Real-World Data

ID   Fault Description                    Diagnosis                            Match
304  Relay EY2 failed open                Health_relay_ey2_cl = stuckOpen      Yes
305  Relay feedback sensor ESH175 failed  Health_relay_ey175_cl = stuckOpen    Yes
306  Circuit breaker ISH262 tripped       Health_breaker_ey262_op = stuckOpen  Yes
308  Voltage sensor E261 failed           Health_e261 = stuckVoltageLo         Yes
309  Battery BATT1 voltage low            Health_battery1 = stuckDisabled      Yes
310  Inverter INV1 failed off             Health_inv1 = stuckOpen              Yes
311  Load sensor LT500 failed             Health_LT500 = stuckLow              Yes

Diagnostic results for different fault scenarios (with IDs 304, 305, ...) for the electrical power system testbed ADAPT.

For experimentation using real-world data, EPS failure scenarios were generated using the ADAPT EPS at NASA Ames. These scenarios cover both component failures (experiments 304, 306, 309, and 310 in the table above) and sensor failures (experiments 305, 308, and 311); many previous efforts have considered only one type of failure. After ADAPT system reconfigurations and fault insertion (for example, insertion of Relay EY260 failed open; see ID 304 in the table above), the ADAPT BN, or an arithmetic circuit compiled from it, is used to compute a diagnosis. The variant of the ADAPT BN used here was largely auto-generated and contains 434 nodes and 482 edges; the BN node cardinalities range from 2 to 4, with a mean of 2.27. ACE was used to compute most probable explanations (MPEs) and most likely values (MLVs). To compute maximum a posteriori probability (MAP), SamIam was used. Here are the timing results for ACE:

[Figure: result graph]

Execution time results, in milliseconds, for ACE for the ADAPT testbed when computing diagnoses using the most probable explanation (MPE).

[Figure: result graph]

Execution time results, in milliseconds, for ACE for the ADAPT testbed when computing diagnoses using the most likely value (MLV).

The results of the ADAPT experiments are provided in the result table and figures above. Since there are over 120 nodes, we only show the variables deemed to be non-healthy in the table. Further, the diagnostic results of the MPE, MLV, and MAP queries turned out to be the same; hence we consolidate them into one column called “Diagnosis” in the table. ADAPT uses a 2 Hz sampling rate, and a probabilistic query was posed to ACE after each sample in an experimental run. The execution time statistics displayed in the above figures are based on the execution times of all probabilistic queries during an experimental run. Each execution time covers an entire inference step, i.e., translating measurements to evidence, committing evidence to the arithmetic circuit, and evaluating the arithmetic circuit.
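
As a rough illustration of how such per-query execution time statistics can be collected, the sketch below times a complete inference step for each sample and summarizes the run. The names `time_queries`, `summarize`, and the `query` callable are hypothetical stand-ins, not part of ACE or the ADAPT software. Note that a 2 Hz sampling rate leaves a budget of 500 ms between samples.

```python
import statistics
import time


def time_queries(query, samples):
    """Run query(sample) once per sample and return wall-clock times in ms."""
    times_ms = []
    for sample in samples:
        start = time.perf_counter()
        query(sample)  # whole inference step: evidence translation + evaluation
        times_ms.append((time.perf_counter() - start) * 1000.0)
    return times_ms


def summarize(times_ms):
    """Summary statistics over one experimental run (needs >= 2 timings)."""
    return {
        "min": min(times_ms),
        "max": max(times_ms),
        "median": statistics.median(times_ms),
        "mean": statistics.mean(times_ms),
        "stdev": statistics.stdev(times_ms),  # sample standard deviation
    }
```

A call such as `summarize(time_queries(my_query, samples))`, where `my_query` is whatever function performs one inference step, returns the five statistics for that run.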

Our main observations regarding these experiments are as follows. First, we see in the table above that the different diagnostic queries correctly diagnose all these component and sensor failure scenarios. Second, we emphasize the fast and predictable inference times for the ACs in the timing figures above. These are both very important factors in real-time electrical power system health management.
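
To make the difference between the MPE and MLV queries concrete, the following sketch computes both by brute-force enumeration on a toy model with two health variables and one observed alarm. The variables `A` and `B`, their probabilities, and the alarm model are invented for illustration; ACE does not work by enumeration, and this is not the ADAPT BN. MPE picks the single most probable joint assignment, while MLV picks the most probable state of each variable separately from its marginal.

```python
from itertools import product

# Hypothetical health priors for two components.
PRIORS = {
    "A": {"healthy": 0.9, "failed": 0.1},
    "B": {"healthy": 0.8, "failed": 0.2},
}


def p_alarm(assignment):
    """Hypothetical sensor model: P(alarm | health states)."""
    return 0.05 if all(s == "healthy" for s in assignment.values()) else 0.9


def posterior_given_alarm():
    """Joint posterior over all health assignments, given that the alarm fired."""
    names = list(PRIORS)
    weights = {}
    for states in product(*(PRIORS[n] for n in names)):
        assignment = dict(zip(names, states))
        w = p_alarm(assignment)
        for n, s in assignment.items():
            w *= PRIORS[n][s]
        weights[states] = w
    z = sum(weights.values())  # probability of the evidence
    return names, {k: v / z for k, v in weights.items()}


def mpe(names, post):
    """Most probable explanation: argmax over joint assignments."""
    return dict(zip(names, max(post, key=post.get)))


def mlv(names, post):
    """Most likely value: argmax of each variable's own marginal."""
    result = {}
    for i, n in enumerate(names):
        marginal = {}
        for states, p in post.items():
            marginal[states[i]] = marginal.get(states[i], 0.0) + p
        result[n] = max(marginal, key=marginal.get)
    return result
```

With these toy numbers both queries blame `B` and clear `A`, which mirrors the agreement between query types reported above; in general the two queries can disagree.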

Experiments using Simulated Data

Simulated data was created by a program that (i) generated a set of failure scenarios according to the probabilities of the ADAPT BN's health nodes, and (ii) for each failure scenario, generated an evidence set on sensor nodes. This large number of evidence sets was then run through different inference systems. In addition to arithmetic circuit evaluation (ACE), we performed experiments with variable elimination (VE) and clique tree propagation (CTP).
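
The two-step generation procedure can be sketched as forward sampling. The example below is an assumption-laden miniature, not the program used for the experiments: the node names, priors, and sensor model are hypothetical, and the real ADAPT BN has hundreds of nodes.

```python
import random

# Hypothetical health-node priors.
HEALTH_PRIORS = {
    "Health_relay": {"healthy": 0.98, "stuckOpen": 0.02},
    "Health_sensor": {"healthy": 0.97, "stuckLow": 0.03},
}


def sample_state(dist, rng):
    """Draw one state from a {state: probability} distribution."""
    states = list(dist)
    return rng.choices(states, weights=[dist[s] for s in states], k=1)[0]


def sample_scenario(rng):
    """Step (i): sample a failure scenario from the health-node priors."""
    return {node: sample_state(dist, rng) for node, dist in HEALTH_PRIORS.items()}


def sensor_distribution(scenario):
    """Hypothetical P(sensor reading | health states)."""
    if scenario["Health_sensor"] == "stuckLow":
        return {"low": 1.0, "high": 0.0}
    if scenario["Health_relay"] == "stuckOpen":
        return {"low": 0.99, "high": 0.01}
    return {"low": 0.01, "high": 0.99}


def generate_evidence_sets(n, seed=0):
    """Step (ii): for each sampled scenario, sample evidence on the sensor node."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    evidence_sets = []
    for _ in range(n):
        scenario = sample_scenario(rng)
        reading = sample_state(sensor_distribution(scenario), rng)
        evidence_sets.append((scenario, {"Sensor": reading}))
    return evidence_sets
```

Keeping the sampled scenario alongside each evidence set makes it possible to check later whether a diagnosis recovers the injected fault.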

Inference Time (ms)
                  MPE                  Marginals
            VE        ACE        CTP       ACE
Minimum     17.25     0.17       8.527     0.4934
Maximum     38.45     2.779      54.51     5.50
Median      17.63     0.1995     9.204     0.24
Mean        17.79     0.2370     10.02     0.6981
St. Dev.    1.513     0.2137     4.451     0.6669

Results for different inference algorithms (VE, ACE, and CTP) when computing MPEs and marginals using data generated from the ADAPT BN.

The table above summarizes the results of experiments with 200 simulated evidence sets generated from the ADAPT BN. ACE is, on average, over 75 times faster than VE when computing MPEs (mean of 0.2370 ms versus 17.79 ms). In addition, ACE computes all marginals in only a fraction of a millisecond more, on average, than it needs for MPEs (0.6981 ms versus 0.2370 ms). In other words, ACE can compute over 400 probabilities 25 times faster than VE computes a single probability. CTP can be used to compute marginals in order to overcome VE's limitation of computing only one probability at a time, but even CTP is over 14 times slower than ACE and has a higher standard deviation.
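
The speedup factors quoted above follow directly from the mean inference times in the table. A short check, using only the table's mean values (the dictionary layout and function name are our own):

```python
# Mean inference times in milliseconds, copied from the table above.
MEAN_MS = {
    ("MPE", "VE"): 17.79,
    ("MPE", "ACE"): 0.2370,
    ("Marginals", "CTP"): 10.02,
    ("Marginals", "ACE"): 0.6981,
}


def speedup(slow, fast):
    """Ratio of mean times: how many times faster `fast` is than `slow`."""
    return MEAN_MS[slow] / MEAN_MS[fast]


ve_vs_ace_mpe = speedup(("MPE", "VE"), ("MPE", "ACE"))                   # over 75
ve_vs_ace_marginals = speedup(("MPE", "VE"), ("Marginals", "ACE"))       # over 25
ctp_vs_ace = speedup(("Marginals", "CTP"), ("Marginals", "ACE"))         # over 14
```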

In summary, VE, CTP, and ACE all run quite efficiently on the ADAPT system, but ACE is one or two orders of magnitude more efficient than the other algorithms, while having lower standard deviation. Diagnostic inference for ADAPT is therefore very efficient for two reasons. First, the BN was carefully generated, using our novel auto-generation algorithm, in a manner that supports efficient inference using any reasonable exact inference algorithm. Second, the particular arithmetic circuit algorithms we have emphasized here, as implemented in ACE, provide very large additional gains.
