Planning and Acting under Uncertainty Using Markov Decision Models


The Markov decision process (MDP), a model of sequential decision-making developed in operations research, allows reasoning about actions with uncertain outcomes in order to determine a course of action that maximizes expected utility. It has been adopted as a framework for artificial intelligence research in decision-theoretic planning and reinforcement learning. In adopting this model, AI researchers have adopted dynamic-programming algorithms developed in operations research for solving MDPs. A drawback of dynamic programming is that it finds a solution for all problem states. In this project we develop a heuristic search approach to solving MDPs that represents the problem space of an MDP as a search graph and uses a heuristic evaluation function to focus computation on the relevant problem states, that is, on states that are reachable from a given start state. Another line of work is aimed at studying the complexity of decentralized MDPs, which offer a formal framework for planning and coordination of multi-agent systems. The complexity of decentralized control makes it infeasible to solve the problem optimally. We are currently developing approximation algorithms for finding a set of policies to accomplish a global objective.

Related Publications


shlomo@cs.umass.edu