Average-Reward Decentralized Markov Decision Processes
Marek Petrik
Shlomo Zilberstein
Abstract
Formal analysis of decentralized decision making
has become a thriving research area in recent years,
producing a number of multi-agent extensions of
Markov decision processes. While much of the
work has focused on optimizing discounted cumulative
reward, optimizing average reward is sometimes
a more suitable criterion. We formalize a
class of such problems and analyze its characteristics,
showing that it is NP-complete and that optimal
policies are deterministic. Our analysis lays the
foundation for designing two optimal algorithms.
Experimental results with a standard problem from
the literature illustrate the applicability of these solution
techniques.