Heuristic Policy Iteration for Infinite-Horizon Decentralized POMDPs

Christopher Amato and Shlomo Zilberstein. Heuristic Policy Iteration for Infinite-Horizon Decentralized POMDPs. Proceedings of the AAMAS Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains (MSDM), 1-15, Estoril, Portugal, 2008.

Abstract

Decentralized POMDPs (DEC-POMDPs) offer a rich model for planning under uncertainty in multiagent settings. Improving the scalability of solution techniques is an important challenge. While an optimal algorithm has been developed for infinite-horizon DEC-POMDPs, it often requires an intractable amount of time and memory. To address this problem, we present a heuristic version of this algorithm. Our approach is able to use initial state information to decrease solution size and often increases solution quality over what is achievable by the optimal algorithm before resources are exhausted. Experimental results demonstrate that this heuristic approach is effective, producing higher values and more concise solutions in all three test domains.

Bibtex entry:

@inproceedings{AZmsdm08,
  author    = {Christopher Amato and Shlomo Zilberstein},
  title     = {Heuristic Policy Iteration for Infinite-Horizon Decentralized {POMDP}s},
  booktitle = {Proceedings of the {AAMAS} Workshop on Multi-Agent Sequential
               Decision Making in Uncertain Domains},
  year      = {2008},
  pages     = {1--15},
  address   = {Estoril, Portugal},
  url       = {http://rbr.cs.umass.edu/shlomo/papers/AZmsdm08.html}
}

shlomo@cs.umass.edu
UMass Amherst