The Robust Software Engineering area developed AutoBayes, a program synthesis system that, given a high-level specification of a statistical model, automatically constructs a C/C++ program that analyzes data in accordance with the model. A user manual is now available as NASA Technical Memorandum NASA/TM—2008–215366 from http://www.sti.nasa.gov/. The manual was written by Johann Schumann, Hamed Jafari, Tom Pressburger, and Ewen Denney, with example models from Wray Buntine and Bernd Fischer, and was reviewed by Kanishka Bhaduri and John Stutz.

AutoBayes has broad application potential at NASA and in industry and academia. It has been and is being applied to problems across NASA’s Exploration, Aeronautics, and Space mission directorates, including: analysis of simulation results for Orion abort and reentry scenarios; small-satellite guidance, navigation, and control; aircraft trajectory data mining; planetary nebula shape analysis; galaxy survey clustering; and earth science data clustering. AutoBayes makes statistical analysis faster and more reliable because effort can be focused on model development and validation, while AutoBayes automatically handles the development of solution algorithms and code. The manual assists AutoBayes users with information on: applications for which AutoBayes has been used; installing and configuring the AutoBayes system; constructing AutoBayes specification models; data analysis using AutoBayes clustering methods; algorithms used in the synthesized code; and many example models of problems from areas such as parameter estimation, clustering, time-series analysis, and reliability modeling.

BACKGROUND: Program synthesis is the systematic, automated construction of efficient executable code from high-level declarative specifications. AutoBayes is a fully automatic program synthesis system for the statistical data analysis domain; in particular, it solves parameter estimation problems. It has seen many successful applications at NASA and is currently being used, for example, to analyze simulation results for Orion.

The input to AutoBayes is a concise description of a data analysis problem composed of 1) a parameterized statistical model and 2) a maximization goal, which is a probability term involving parameters and input data. The output of AutoBayes is optimized and fully documented C/C++ code that, given input data, computes the most probable parameter values, that is, the values that maximize the probability term. Statistical analysis problems such as parameter estimation, clustering, and change-point detection can be described in this fashion. The output code can be linked dynamically into Matlab, Octave, and other environments. AutoBayes uses Bayesian networks internally to decompose complex statistical models and to derive algorithms for their solution. Its powerful symbolic system enables AutoBayes to solve many sub-problems symbolically rather than having to rely on numerical approximation algorithms, thus yielding effective, efficient, and compact code.
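
To make the idea of a symbolically solved maximization goal concrete, here is a minimal hand-written sketch in Python. It is neither AutoBayes specification syntax nor AutoBayes output (the synthesized code is C/C++). It assumes the simplest possible model, independent samples from a single Gaussian with unknown mean and standard deviation, and a goal of maximizing the probability of the data given those parameters. For this model the maximizing values have a closed form, so no numerical optimizer is needed. The function names and the sample data are hypothetical and chosen only for illustration.

```python
import math


def log_likelihood(data, mu, sigma):
    """Log of the probability term: sum of log N(x | mu, sigma) over all samples."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return (-n * math.log(sigma)
            - 0.5 * n * math.log(2.0 * math.pi)
            - squared_error / (2.0 * sigma ** 2))


def estimate_gaussian(data):
    """Closed-form maximizer of log_likelihood for this (assumed) model.

    Setting the derivatives with respect to mu and sigma to zero gives
    the sample mean and the (biased) sample standard deviation.
    """
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / n)
    return mu, sigma


# Hypothetical input data, for illustration only.
data = [4.9, 5.1, 5.0, 5.3, 4.7]
mu_hat, sigma_hat = estimate_gaussian(data)
print(f"mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}")
```

Evaluating `log_likelihood` at the estimate and at nearby parameter values is a quick sanity check that the closed-form answer is indeed the maximum. Richer models, such as clustering with mixture models, generally have no complete closed-form solution, which is why only some sub-problems can be solved symbolically.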

AutoBayes Team: Hamed Jafari (National Space Grant Association), Tom Pressburger (NASA), Ewen Denney (RIACS)

Collaborators: Wray Buntine (NICTA, Australia), Bernd Fischer (Univ. Southampton, UK), Kanishka Bhaduri (SGT), John Stutz (NASA)

NASA PROGRAM FUNDING: Exploration Technology Development Program

Contact: Johann Schumann

02/13/2009