Solving POMDPs Using Quadratically Constrained Linear Programs
Christopher Amato
Daniel S. Bernstein
Shlomo Zilberstein
Abstract
Developing scalable algorithms for solving partially
observable Markov decision processes
(POMDPs) is an important challenge. One approach
that effectively addresses the intractable
memory requirements of POMDP algorithms is
based on representing POMDP policies as finite-state
controllers. In this paper, we illustrate some
fundamental disadvantages of existing techniques
that use controllers. We then propose a new
approach that formulates the problem as a quadratically
constrained linear program (QCLP), which
defines an optimal controller of a desired size. This
representation allows a wide range of powerful
nonlinear programming algorithms to be used to
solve POMDPs. Although QCLP optimization
techniques guarantee only local optimality, the
results we obtain using an existing optimization
method show significant solution improvement
over the state-of-the-art techniques. The results
open up promising research directions for solving
large POMDPs using nonlinear programming
methods.
Download
[pdf]