Memory-Bounded Dynamic Programming for DEC-POMDPs
Sven Seuken
Shlomo Zilberstein
Abstract
Decentralized decision making under uncertainty
has been shown to be intractable when each agent
has different partial information about the domain.
Thus, improving the applicability and scalability
of planning algorithms is an important challenge.
We present the first memory-bounded dynamic programming
algorithm for finite-horizon decentralized
POMDPs. A set of heuristics is used to identify
relevant points of the infinitely large belief
space. Using these belief points, the algorithm
successively selects the best joint policies for each
horizon. The algorithm is extremely efficient, having
linear time and space complexity with respect
to the horizon length. Experimental results show
that it can handle horizons that are multiple orders
of magnitude larger than what was previously possible,
while achieving the same or better solution
quality. These results significantly increase the applicability
of decentralized decision-making techniques.
Download
[pdf]