Solving POMDPs Using Quadratically Constrained Linear Programs
Christopher Amato
Daniel S. Bernstein
Shlomo Zilberstein
Abstract
Developing scalable algorithms for solving partially observable
Markov decision processes (POMDPs) is an important
challenge. One promising approach is based on representing
POMDP policies as finite-state controllers. This method has
been used successfully to address the intractable memory requirements
of POMDP algorithms. We illustrate some fundamental
theoretical limitations of existing techniques that
use controllers. We then propose a new approach that formulates
the problem as a quadratically constrained linear
program (QCLP), the solution of which provides an optimal
controller of a desired size. We evaluate several optimization
methods for solving QCLPs and compare their performance
with existing POMDP optimization methods. While the optimization
algorithms used in this paper only guarantee locally
optimal solutions, the results show consistent improvement
of solution quality over the state-of-the-art techniques.
The results show that powerful nonlinear programming algorithms
can be used effectively to improve the performance
and scalability of POMDP algorithms.
Download
[pdf]