Publications

Selected publications (including collaborations, some with external funding):

Information Theory—The Bridge Connecting Bounded Rational Game Theory and Statistical Physics

David H. Wolpert

A long-running difficulty with conventional game theory has been how to modify it to accommodate the bounded rationality of all real-world players. A recurring issue in statistical physics is how best to approximate joint probability distributions with decoupled (and therefore far more tractable) distributions. This paper shows that the same information-theoretic mathematical structure, known as Product Distribution (PD) theory, addresses both issues. In this, PD theory not only provides a principled formulation of bounded rationality and a set of new types of mean field theory in statistical physics; it also shows that those topics are fundamentally one and the same. [In Y. Bar-Yam and D. Braha (eds.), *Complex Engineering Systems,* Perseus Books, in press.]

A Comparative Study of Probability Collectives-based Multi-agent Systems and Genetic Algorithms

Chien-Feng Huang, Stefan Bieniawski, Charlie E. M. Strauss, and David H. Wolpert

We compare Probability Collectives (PC) -- a new framework for distributed optimization and control -- with genetic algorithms (GAs). PC-based methods do not update populations of solutions; instead, they update an explicitly parameterized probability distribution *p* over the space of solutions. (This optimizes a functional of *p,* chosen so that any *p* that optimizes it should be peaked about good solutions.) The approach works for both continuous and discrete problems, and has no resolution limitation from encoding parameters into finite bit-length GA alleles. We review the PC approach -- using its motivation as the information-theoretic formulation of bounded rationality for multi-agent systems -- and compare it with GAs on a diverse set of problems. To handle high-dimensional surfaces, *p* is here restricted to a product distribution. Each distribution in that product is controlled by a separate agent. On test functions known to be difficult, our PC-based approach significantly outperforms traditional GAs in rate of descent, avoidance of trapping in false minima, and long-term optimization. [Submitted to the *Genetic and Evolutionary Computation Conf. (GECCO-2005).*]
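The PC update loop sketched in the abstract can be illustrated in a few dozen lines. This is a minimal, hedged reading of the approach, not the authors' implementation: each agent holds an independent categorical distribution over its own moves, joint moves are Monte Carlo sampled from the product distribution, each agent estimates the conditional expected cost of each of its moves from those samples, and each distribution is reset to a Boltzmann distribution over the estimates while a temperature parameter is annealed. The function name and all parameter values below are our own choices.

```python
import math
import random

def pc_minimize(G, moves_per_agent, iters=60, samples=200, T0=1.0,
                decay=0.95, seed=0):
    """Illustrative Probability Collectives-style minimizer of G(x),
    where x is a joint move (one component per agent)."""
    rng = random.Random(seed)
    q = [[1.0 / k] * k for k in moves_per_agent]  # start uniform
    T = T0
    for _ in range(iters):
        # Sample joint moves from the current product distribution.
        draws = []
        for _ in range(samples):
            x = [rng.choices(range(k), weights=q[i])[0]
                 for i, k in enumerate(moves_per_agent)]
            draws.append((x, G(x)))
        worst = max(g for _, g in draws)
        # Per-agent Boltzmann update from conditional cost estimates.
        for i, k in enumerate(moves_per_agent):
            tot = [0.0] * k
            cnt = [0] * k
            for x, g in draws:
                tot[x[i]] += g
                cnt[x[i]] += 1
            # Unseen moves fall back to the worst sampled cost.
            avg = [tot[a] / cnt[a] if cnt[a] else worst for a in range(k)]
            lo = min(avg)
            w = [math.exp(-(c - lo) / T) for c in avg]
            z = sum(w)
            q[i] = [wi / z for wi in w]
        T *= decay  # anneal toward the cost minimizer
    # Report each agent's most probable move.
    return [max(range(k), key=lambda a: q[i][a])
            for i, k in enumerate(moves_per_agent)]
```

For example, on the separable cost G(x) = sum_i (x_i - 2)^2 with five moves per agent, the distributions peak around x_i = 2 as the temperature falls.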

Distributed Optimization and Flight Control Using Collectives

Stefan Bieniawski

Sensors and actuators placed throughout an aircraft (including GPS and other external sensors) may result in a "nervous system" that dramatically increases performance while reducing maintenance costs, but that also adds to system complexity. The nervous system and control techniques can be extended to swarms of small, unmanned aerial vehicles (UAVs) or to clusters of other types of agents. Inherently distributed and highly nonlinear components with complex, sometimes competing interactions present significant design challenges. Collectives -- specifically those of Probability Collectives (PC) theory -- have shown promise in distributed optimization for discrete, continuous, mixed, and constrained optimization problems. They are naturally applied to distributed systems and to those involving uncertainty, such as control in the presence of noise and disturbances. This document describes PC theory and its implementation, including its connections to multi-agent systems, machine learning, statistics, and gradient-based optimization. Building upon recent advances in actuator technology, one small, simple flow-control device under investigation is the Miniature Trailing Edge Effector (MiTE), which consists of sensing, logic, and a small, moveable surface mounted at a wing's trailing edge. High bandwidth, distributed placement, and good control authority make the device an ideal candidate for control of both the rigid-body and flexible modes of a flight vehicle. This potential is demonstrated in two experiments: flutter suppression of a flexible wing (increasing the flutter speed by over 25%) and MiTE-based distributed flight control of a remotely piloted aircraft. [Ph.D. dissertation, Stanford University, Sep. 2005.]

Distributed Control by Lagrangian Steepest Descent

David H. Wolpert and Stefan Bieniawski

Adaptive, distributed control can often be viewed as an iterated game between independent players. The coupling between the players' mixed strategies -- arising as the system evolves from one instant to the next -- is determined by the system designer. Information theory tells us that the most likely joint strategy of the players, given a value of the expectation of the overall control objective function, is the minimizer of a Lagrangian function of the joint strategy. The goal of the system designer is to speed evolution of the joint strategy toward that Lagrangian-minimizing point, lower the expected value of the control objective function, and repeat. Here we elaborate the theory of algorithms that do this using local descent procedures, and that thereby achieve efficient, adaptive, distributed control. [In *Proc. Conf. on Decision and Control,* Bahamas, Dec. 14-17, 2004.]
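In PD theory's maxent form, the Lagrangian over a product distribution q is L(q) = E_q[G] - T S(q), and local descent can be run per agent. The sketch below is our own simplified illustration of such a descent step, not the paper's algorithm: two agents with binary moves, distributions parameterized by softmax logits, and the exact per-agent gradient dL/dq_i(a) = E[G | agent i plays a] + T (ln q_i(a) + 1) pushed through the softmax. The function name and parameter values are assumptions made for the example.

```python
import math

def maxent_lagrangian_descent(G, T=0.2, eta=1.0, steps=300):
    """Gradient descent on L(q) = E_q[G] - T*(S(q1) + S(q2)) for two
    binary-move agents, with q1 x q2 a product distribution.  G is a
    2x2 cost table indexed by the two agents' moves."""
    theta = [[0.0, 0.0], [0.0, 0.0]]  # per-agent softmax logits

    def softmax(t):
        m = max(t)
        w = [math.exp(x - m) for x in t]
        z = sum(w)
        return [x / z for x in w]

    q = [softmax(t) for t in theta]
    for _ in range(steps):
        for i in (0, 1):
            j = 1 - i
            # E[G | agent i plays a], averaging over the other agent.
            cond = [sum(q[j][b] * (G[a][b] if i == 0 else G[b][a])
                        for b in (0, 1)) for a in (0, 1)]
            g = [cond[a] + T * (math.log(q[i][a]) + 1) for a in (0, 1)]
            gbar = sum(q[i][a] * g[a] for a in (0, 1))
            # Chain rule through the softmax: dL/dtheta_a = q_a*(g_a - gbar).
            for a in (0, 1):
                theta[i][a] -= eta * q[i][a] * (g[a] - gbar)
            q[i] = softmax(theta[i])
    return q
```

With the cost table G = [[0, 2], [2, 1]], whose unique minimum is at joint move (0, 0), both agents' distributions concentrate on move 0, with the entropy term keeping them smoothly interior throughout the descent.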

Discrete, Continuous, and Constrained Optimization Using Collectives

Stefan Bieniawski, David H. Wolpert, and Ilan Kroo

Aerospace systems comprise many interacting components, some with competing requirements. Their increasing complexity and performance requirements call for optimal design and control. Distributed optimization can sometimes exploit limited coupling of variables in these large systems, avoiding the centralized coordination and synchronous updates required by standard optimization approaches. Recent developments enhance the applicability of Probability Collectives (PC) theory to discrete, continuous, mixed, and constrained optimization problems. (A collective is a system where each agent is self-interested and capable of learning, together with a specified system objective that rates the performance of the joint actions of the agents.) This paper presents the theoretical underpinnings of the approach, with example problems. A constrained discrete structural optimization and a continuous trajectory optimization illustrate the breadth of the collectives approach. [*10th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conf.,* Albany, Aug. 30-Sep. 1, 2004.]

Product Distribution Theory for Control of Multi-agent Systems

Chiu Fan Lee and David H. Wolpert

Product Distribution (PD) theory is a new framework for controlling multi-agent systems (MAS). First we review one motivation of PD theory: the information-theoretic extension of conventional full-rationality game theory to the case of bounded rational agents. We consider a team game in which a shared utility measures performance of the MAS. The game is at equilibrium -- the Lagrangian of the probability distribution is optimized -- when the joint distribution of the agents optimizes the system's expected performance. One common way to find that equilibrium is to have each agent run a reinforcement learning algorithm. Here we exploit PD theory to run gradient descent on the Lagrangian. We also demonstrate how PD theory can improve performance even when we are not allowed to re-run the MAS from different initial conditions, a requirement implicit in some previous work. [In *Proc. 3rd Int'l Joint Conf. on Autonomous Agents and Multi-Agent Systems,* New York, July 19-23, 2004.]

Adaptive, Distributed Control of Constrained Multi-agent Systems

Stefan Bieniawski and David H. Wolpert

Product Distribution (PD) theory was recently developed as a framework for analyzing and optimizing distributed systems. In this paper, we demonstrate its use for adaptive distributed control -- i.e., distributed stochastic optimization -- of multi-agent systems (MAS). A traditional approach has each agent run a reinforcement learning (RL) algorithm. PD theory allows a variant that instead applies Newton's method to the agents' probability distributions. This approach outperforms an RL-based scheme in all three domains tested. [In *Proc. 3rd Int'l Joint Conf. on Autonomous Agents and Multi-Agent Systems,* New York, July 19-23, 2004.]

Product Distributions for Distributed Optimization

Stefan Bieniawski and David H. Wolpert

With connections to bounded rational game theory, information theory, and statistical mechanics, Product Distribution (PD) theory provides a new framework for performing distributed optimization. It also extends and formalizes Collective Intelligence theory, thus connecting distributed optimization to distributed reinforcement learning (RL). This paper provides an overview of PD theory and details a derived optimization algorithm. The approach is demonstrated on unconstrained optimization problems with discrete variables and with continuous variables. Results are compared with those obtained using distributed RL-inspired optimization approaches. [*Int'l Conf. on Complex Systems,* Boston, May 16-21, 2004.]

Fleet Assignment Using Collective Intelligence

Nicolas E. Antoine, Stefan Bieniawski, David H. Wolpert, and Ilan Kroo

Product Distribution (PD) theory is a new "collective intelligence"-based framework for analyzing and controlling distributed systems. Its usefulness in distributed stochastic optimization is illustrated here through an airline fleet assignment problem. This problem involves the allocation of aircraft to a set of flight legs in order to meet passenger demand while satisfying a variety of linear and nonlinear constraints. Over the course of the day, the routings are chosen to minimize the number of required flights. Flow continuity and aircraft count constraints have led researchers to focus on quasi-optimal solutions, especially at larger scales. We apply a PD-based stochastic optimization algorithm to a nonlinear objective, "cold start" fleet assignment problem, successfully solving problems with 130 variables and 184 constraints. [Extended version, from the *42nd AIAA Aerospace Sciences Meeting,* Reno, Jan. 5-8, 2004.]

Distributed Constrained Optimization

William Macready and David H. Wolpert

This paper applies Product Distribution theory to the design of distributed solutions to constrained optimization problems. The optimization proceeds via an extension of game theory to agents with bounded rationality. An annealing algorithm minimizes the Lagrangian of the probability distribution of the agents' joint state. Computer experiments are presented for the k-SAT constraint-satisfaction problem and for unconstrained minimization of NK functions. [*Int'l Conf. on Complex Systems 2004,* Y. Bar-Yam (ed.), Perseus Books, 2004.]
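An annealed product-distribution approach to SAT can be sketched as follows. This is our own hedged illustration under simplifying assumptions, not the paper's implementation: one agent per Boolean variable holds q[i] = P(variable i is True), the cost counts unsatisfied clauses, and each iteration samples joint assignments, estimates each agent's conditional expected cost, and resets q[i] to a Boltzmann distribution at a temperature that is annealed downward. Clauses are DIMACS-style signed integers; the function name and parameter values are assumptions.

```python
import math
import random

def sat_collective(clauses, n_vars, iters=80, samples=300, T0=2.0,
                   decay=0.93, seed=1):
    """Annealed product-distribution sketch for SAT; returns the best
    sampled assignment (a list of booleans)."""
    rng = random.Random(seed)
    q = [0.5] * n_vars  # P(variable i is True), initially uniform

    def unsat(x):
        # Number of clauses with no satisfied literal.
        return sum(not any(x[abs(l) - 1] == (l > 0) for l in c)
                   for c in clauses)

    best_x, best_g = None, float("inf")
    T = T0
    for _ in range(iters):
        draws = []
        for _ in range(samples):
            x = [rng.random() < q[i] for i in range(n_vars)]
            g = unsat(x)
            draws.append((x, g))
            if g < best_g:
                best_x, best_g = x, g
        worst = max(g for _, g in draws)
        for i in range(n_vars):
            tot, cnt = [0.0, 0.0], [0, 0]
            for x, g in draws:
                tot[x[i]] += g
                cnt[x[i]] += 1
            # Conditional expected cost of setting variable i False/True;
            # unseen settings fall back to the worst sampled cost.
            avg = [tot[v] / cnt[v] if cnt[v] else worst for v in (0, 1)]
            lo = min(avg)
            w = [math.exp(-(a - lo) / T) for a in avg]
            q[i] = w[1] / (w[0] + w[1])
        T *= decay  # anneal toward the Lagrangian minimizer
    return best_x
```

On a tiny 2-SAT formula with the unique satisfying assignment (True, False, True), the sampler concentrates on that assignment as the temperature falls.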

For slide presentations, see our Presentations page. For additional theory and application papers, see David Wolpert's Science of Collectives page or the Stanford Probability Collectives page.

**Principal Investigator**

David Wolpert

**Other Contributors**

Nicolas E. Antoine (Airbus/Stanford)

Stefan Bieniawski (Stanford)

Ilan Kroo (Stanford)

Chiu Fan Lee (Oxford)

Bill Macready (Dwave Systems/NASA)

Dave Nicholson (BAE Systems)

Dev Rajnarayan (Stanford)

Charlie E. M. Strauss (LANL)

**Links**

David Wolpert's Home Page

PhD Thesis on Flutter Suppression

Stanford Probability Collectives Site

Collectives Workshop, 2003

Stanford AA Design Group