Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs

Duc Thien Nguyen, William Yeoh, Hoong Chuin Lau, Shlomo Zilberstein, and Chongjie Zhang. Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs (Extended Abstract). Proceedings of the Thirteenth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 1341-1342, Paris, France, 2014.

Abstract

Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a Dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, an assumption that might not hold in some applications. In this paper, we introduce a new model, called Markovian Dynamic DCOPs (MD-DCOPs), where the DCOP in each time step is a function of the value assignments in the preceding DCOP. We also introduce a distributed reinforcement learning algorithm that balances exploration and exploitation to solve MD-DCOPs in an online manner.
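The Markovian coupling and the exploration/exploitation trade-off described above can be illustrated with a toy sketch. This is not the paper's algorithm; it is a hypothetical two-agent example in which the constraint reward at each time step depends on both the current and the previous joint assignment (the Markovian coupling), and each agent independently runs epsilon-greedy learning over sample-average rewards. The reward table, epsilon value, and independent-learner simplification are all illustrative assumptions.

```python
import random

# Hypothetical reward: agreeing pays off, and moving the joint assignment
# away from the previous one earns a small bonus (the Markovian coupling).
def reward(prev, curr):
    base = 10 if curr[0] == curr[1] else 0
    bonus = 2 if curr != prev else 0
    return base + bonus

def run(episodes=5000, epsilon=0.1, seed=0):
    """Decentralized epsilon-greedy learners on a toy Markovian Dynamic DCOP.

    Each agent keeps its own table of sample-average rewards keyed by
    (previous joint assignment, own value) -- an independent-learner
    simplification, not the coordinated method from the paper.
    Returns the empirical average reward per time step.
    """
    rng = random.Random(seed)
    q = [dict(), dict()]       # per-agent sample-average reward estimates
    counts = [dict(), dict()]  # per-agent sample counts for those averages
    prev = (0, 0)              # previous joint assignment (the "state")
    total = 0.0
    for _ in range(episodes):
        curr = []
        for i in (0, 1):
            if rng.random() < epsilon:   # explore: random value in {0, 1}
                curr.append(rng.randint(0, 1))
            else:                        # exploit: value with best estimate
                vals = [q[i].get((prev, v), 0.0) for v in (0, 1)]
                curr.append(0 if vals[0] >= vals[1] else 1)
        curr = tuple(curr)
        r = reward(prev, curr)
        total += r
        for i in (0, 1):
            key = (prev, curr[i])
            counts[i][key] = counts[i].get(key, 0) + 1
            # incremental sample-average update
            q[i][key] = q[i].get(key, 0.0) + (r - q[i].get(key, 0.0)) / counts[i][key]
        prev = curr              # the next DCOP depends on this assignment
    return total / episodes
```

Because each step's reward depends on the previous assignment, a myopic per-step solver is not enough; the learners must account for how today's choice shapes tomorrow's problem, which is the motivation for the average-reward RL view taken in the paper.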

Bibtex entry:

@inproceedings{NYLZZaamas14,
  author    = {Duc Thien Nguyen and William Yeoh and Hoong Chuin Lau and
               Shlomo Zilberstein and Chongjie Zhang},
  title     = {Decentralized Multi-Agent Reinforcement Learning in
               Average-Reward Dynamic DCOPs},
  booktitle = {Proceedings of the Thirteenth International Conference on
               Autonomous Agents and Multiagent Systems},
  year      = {2014},
  pages     = {1341--1342},
  address   = {Paris, France},
  url       = {http://rbr.cs.umass.edu/shlomo/papers/NYLZZaamas14.html}
}

shlomo@cs.umass.edu
UMass Amherst