The Golden Rule
Instead of finding x to optimize a function,
find a distribution over x that optimizes
an associated expectation value.
(This transforms optimization problems over discrete or hybrid domains to problems over continuous domains.)
Advantages
- Works for arbitrary (mixed) data types x. P(x) is always a vector of real numbers, no matter what data type x is.
- Leverages techniques for the optimization of Euclidean vectors — the most powerful optimization techniques we have.
("Gradient descent for symbolic variables.")
- P(x) provides sensitivity information (i.e., which components of x are most important).
- Can be "seeded" with solutions of other algorithms: peaks of initial P(x).
- Can include Bayesian prior knowledge.
- Automatically accommodates noisy, poorly modeled problems.
Deep connections with statistical physics and game theory. So...
- Especially suited for distributed domains.
- Especially suited for very large problems.