NASA Logo, National Aeronautics and Space Administration

Mode Identification and Recovery

              <p>
               <strong>Here is an authentic example which shows how MIR works.</strong>
              </p>  
              <p>
                During the DS1 mission, the spacecraft sends data, such as photographs of the asteroid, 
                through its communication system to the Deep Space Network (DSN).  The communication 
                system consists of many devices, which MIR calls components; each has a specific function, 
                and they work together to send the signal. A failure in any of these components could 
                prevent Earth from receiving any signals.  It is very important that any failures are 
                detected and corrected; otherwise, all information gathered during the mission could be lost.
              </p>
              <h2>Gathering Sensor Data</h2>
              <p>
                The first step in monitoring the health of a spacecraft is to gather information 
                regarding how the components and systems on the spacecraft are behaving.  Sensors are 
                placed throughout the spacecraft and constantly feed MIR such data.   There are sensors 
                which detect whether current is flowing through a circuit, whether valves are open or 
                closed, whether switches are on or off, etc.   Part of the challenge of spacecraft 
                engineering is to determine how many sensors are necessary and where they should be 
                located.  Sensors add mass to the spacecraft and use energy, so the fewer the better; 
                however, if there are too few, there will not be enough information to accurately 
                determine the status of the spacecraft.
              </p>
              <img src="/m/project/remote-agent/images/telecom2.png" alt="Telecom Graphic" height="230" width="490">
              <p>
                The diagram above shows the components of the telecommunication system. The signal flows 
                through the Small Deep Space Transponder (SDST) and is amplified by one of the power 
                conversion units (PCU A or PCU B). After passing through the driver and diplexer, it is 
                directed by the waveguide switches (WTS 1 and WTS 2) to either the high-gain antenna 
                (HGA) or one of the low-gain antennas (LGA +z or LGA +x). Sensors report the level of 
                current coming from the SDST, PCU A, and PCU B, and whether each WTS is in position 
                a or b.
              </p>
              <h2>Using Models to Detect Failures</h2>
              <p>
                To detect failures, MIR needs to know what state the spacecraft's systems should be in 
                at any point in time.  MIR accomplishes this through the use of "models."  A model is a 
                description of the components that make up a system on the spacecraft and the expected 
                behaviors, or "modes," of those components.  The model covers many different 
                combinations of expected component behaviors in various situations.  For example, if a 
                radio signal is being sent through Power Conversion Unit A (PCU A), the model would show 
                PCU A drawing power in order to amplify the signal.
              </p>
              <p>
                MIR knows what should be happening on the spacecraft by eavesdropping on the Smart 
                Executive (EXEC) when it commands different parts of the spacecraft to carry out various 
                actions.  For example, DS1 has several antennas through which it transmits information.  
                If the spacecraft has been commanded to switch to the +z axis low-gain antenna and to 
                send the signal by way of Power Conversion Unit A (PCU A), MIR hears that information.  
                Based on its model, MIR knows the "mode" that each component should be in and, therefore, 
                what the sensors should read.  In this example, the sensors should read that PCU A is 
                drawing power and that both waveguide switches are in the a position. The diagram below 
                illustrates the model given this information.
              </p>
              <img src="/m/project/remote-agent/images/telecom3.png" alt="Telecom Diagram 3" height="230" width="490">
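              <p>
                The prediction step described above can be sketched in a few lines of Python. This is 
                only an illustration, not DS1 flight software: the function <code>expected_readings</code>, 
                the table <code>ANTENNA_SWITCH_POSITIONS</code>, and the reading labels are hypothetical 
                names. Only the LGA +z entry comes from the example in the text, and the assumption that 
                the unused PCU reads as idle is ours.
              </p>

```python
# Hypothetical lookup table: switch positions that route the signal to an antenna.
# Only the LGA +z entry is taken from the example in the text.
ANTENNA_SWITCH_POSITIONS = {
    "LGA+z": {"WTS1": "a", "WTS2": "a"},
}

PCUS = ("PCU_A", "PCU_B")


def expected_readings(antenna, pcu):
    """Return the sensor readings the model predicts for a commanded setup."""
    if antenna not in ANTENNA_SWITCH_POSITIONS:
        raise ValueError(f"no model entry for antenna {antenna!r}")
    if pcu not in PCUS:
        raise ValueError(f"unknown PCU {pcu!r}")
    # Start from the switch positions required for the chosen antenna.
    readings = dict(ANTENNA_SWITCH_POSITIONS[antenna])
    # Assumption: the commanded PCU draws power and the other one is idle.
    for name in PCUS:
        readings[name] = "drawing_power" if name == pcu else "idle"
    return readings


prediction = expected_readings("LGA+z", "PCU_A")
print(prediction)
```

              <p>
                A table lookup is enough for this one example; a model covering every antenna and 
                failure case would need many more entries.
              </p>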
              <p>
                MIR's model also includes the expected modes of the components given certain failures.  
                For example, the model, illustrated in the diagram below, shows that if PCU A were broken,
                it would not be drawing any power.
              </p>
              <img src="/m/project/remote-agent/images/telecom4.png" alt="Telecom Diagram 4" height="233" width="496">
              <p>
                Finally, MIR compares its model of what the spacecraft should look like to the actual 
                status of the spacecraft based on the sensor data.  If there is a conflict between the 
                two, MIR knows that some failure has taken place and searches its model to find out which 
                failure would give the current sensor reading.
              </p>
              <p>
                For example, in the scenario above (signal sent using the +z antenna and PCU A), the 
                sensors may show that, although PCU A is drawing power, waveguide switch 2 is in the b 
                position rather than the a position. This situation is illustrated in the diagram below.
              </p>
              <img src="/m/project/remote-agent/images/telecom5.png" alt="Telecom Diagram 5" height="230" width="490">
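              <p>
                The comparison between predicted and actual readings can be sketched as follows. The 
                function name <code>find_conflicts</code> and the dictionary layout are hypothetical 
                choices made for this illustration. The sketch only flags mismatches; it does not 
                search the model for the failure that explains them.
              </p>

```python
def find_conflicts(expected, actual):
    """Return {sensor: (expected, actual)} for every reading that disagrees."""
    return {
        sensor: (want, actual.get(sensor))
        for sensor, want in expected.items()
        if actual.get(sensor) != want
    }


# Readings predicted by the model for the +z antenna and PCU A scenario.
expected = {"PCU_A": "drawing_power", "WTS1": "a", "WTS2": "a"}
# Readings reported by the sensors: waveguide switch 2 is in position b.
actual = {"PCU_A": "drawing_power", "WTS1": "a", "WTS2": "b"}

conflicts = find_conflicts(expected, actual)
print(conflicts)  # {'WTS2': ('a', 'b')}
```

              <p>
                An empty result means the sensors agree with the model; any entry means some failure 
                has taken place and needs to be diagnosed.
              </p>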
              <h2>Identifying and Diagnosing a Failure</h2>
              <p>
                Once MIR detects a conflict between the expected and actual sensor information, it 
                determines the most likely cause of the conflict.  Determining which component or 
                components are actually failing and in what manner (the "failure mode") is called 
                diagnosis.  In the example above, the model showed that waveguide switch 2 should be in 
                the a position, whereas the actual sensor data shows it in the b position. Three 
                possible failure modes, or combinations of failure modes, could be causing this:
              </p>
              <ul>
                <li>the waveguide switch is temporarily stuck in the b position</li>
                <li>the waveguide switch is permanently stuck in the b position</li>
                <li>both sensors are failing and therefore reporting erroneous data</li>
              </ul>
              <p>
                Each failure mode is assigned a probability of occurring.  MIR first assumes that the 
                most likely failure mode is the correct diagnosis.
              </p>
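              <p>
                Ranking the candidate failure modes by probability can be sketched as below. The class 
                <code>FailureMode</code>, the function <code>rank_diagnoses</code>, and all probability 
                values are invented for illustration; the real DS1 figures are not given here.
              </p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    description: str
    probability: float  # illustrative value only, not a DS1 figure


def rank_diagnoses(candidates):
    """Order candidate failure modes from most to least likely."""
    return sorted(candidates, key=lambda mode: mode.probability, reverse=True)


candidates = [
    FailureMode("WTS 2 permanently stuck in position b", 0.01),
    FailureMode("WTS 2 temporarily stuck in position b", 0.05),
    FailureMode("sensors failing and reporting erroneous data", 0.001),
]

ranked = rank_diagnoses(candidates)
for mode in ranked:
    print(f"{mode.probability:.3f}  {mode.description}")
```

              <p>
                The first entry of <code>ranked</code> plays the role of the diagnosis that MIR assumes 
                first; the remaining entries are the fallbacks.
              </p>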
              <h2>Recovering from a Failure</h2>
              <p>
                The last step is to take action to recover from the failure.  In the example above, if a 
                temporarily stuck waveguide switch is the most likely failure, MIR will report this 
                diagnosis to EXEC. EXEC will then ask MIR for the best recovery action. Many recovery 
                actions may disrupt other spacecraft activities, and MIR takes this into account when 
                suggesting a recovery action. If the suggested recovery action doesn't work, MIR will 
                discard this diagnosis and move on to the next most likely diagnosis.  If a failure is 
                permanent, MIR will report this to EXEC.  EXEC will then do one of three things: attempt 
                to accomplish the plan goals without using the failed component, ask the 
                Planner/Scheduler (PS) to generate a new plan that takes the failure into account, or go 
                into stand-by mode and wait for instructions from Earth.
              </p>
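              <p>
                The try-then-fall-back behavior described above might look roughly like this. It is a 
                simplified sketch under our own assumptions: <code>recover</code> and 
                <code>try_recovery</code> are hypothetical names, each diagnosis is reduced to a name 
                and a permanent/temporary flag, and the cost of disrupting other spacecraft activities 
                is ignored.
              </p>

```python
def recover(ranked_diagnoses, try_recovery):
    """Walk the diagnoses from most to least likely.

    ranked_diagnoses: list of (name, is_permanent) tuples, most likely first.
    try_recovery: callable(name) returning True if the recovery action worked.
    Returns (outcome, name), where outcome is 'recovered',
    'permanent_failure', or 'unresolved'.
    """
    for name, is_permanent in ranked_diagnoses:
        if is_permanent:
            # Nothing to retry: hand the decision back to the executive.
            return ("permanent_failure", name)
        if try_recovery(name):
            return ("recovered", name)
        # The recovery action did not work, so discard this diagnosis.
    return ("unresolved", None)


diagnoses = [
    ("WTS 2 temporarily stuck in position b", False),
    ("WTS 2 permanently stuck in position b", True),
]

# Simulate a recovery action that fails, forcing the next diagnosis.
result = recover(diagnoses, try_recovery=lambda name: False)
print(result)
```

              <p>
                In this run the recovery action for the temporary failure does not work, so the sketch 
                falls through to the permanent failure and reports it instead of retrying.
              </p>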
              <h2>Monitoring Plan Execution</h2>
              <p>
                The description above focuses on failure detection and diagnosis, but MIR's constant 
                monitoring also provides feedback when things are going as planned.  As stated before, 
                MIR compares the actual status of the spacecraft (based on sensor information) to the 
                model's prediction based on EXEC's commands and PS's plan.  If the predicted and actual 
                status match after a given command, MIR reports that the command was completed 
                successfully, which signals EXEC to go on with the next part of the plan.
              </p>
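              <p>
                The command-by-command confirmation can be sketched as a loop that advances only while 
                prediction and observation agree. The names <code>run_plan</code> and 
                <code>observe</code>, the command strings, and the readings are all hypothetical; 
                the sensor data is simulated with a fixed table.
              </p>

```python
def run_plan(plan, observe):
    """Step through a plan, confirming each command against sensor data.

    plan: list of (command, predicted_readings) pairs.
    observe: callable(command) returning the actual readings after the command.
    Returns the commands confirmed before the first mismatch, if any.
    """
    confirmed = []
    for command, predicted in plan:
        if observe(command) != predicted:
            break  # conflict: stop here so the failure can be diagnosed
        confirmed.append(command)  # match: the next command may proceed
    return confirmed


plan = [
    ("select PCU A", {"PCU_A": "drawing_power"}),
    ("set WTS 2 to a", {"WTS2": "a"}),
]

# Simulated sensor data: the second command leaves WTS 2 in position b.
simulated = {
    "select PCU A": {"PCU_A": "drawing_power"},
    "set WTS 2 to a": {"WTS2": "b"},
}

confirmed = run_plan(plan, observe=simulated.get)
print(confirmed)  # ['select PCU A']
```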