Informatics Report Series


Report   

EDI-INF-RR-0934


Related Pages

Report (by Number) Index
Report (by Date) Index
Author Index
Institute Index

Home
Title:Probabilistic inference for solving (PO)MDP's
Authors: Marc Toussaint ; Stefan Harmeling ; Amos Storkey
Date:Dec 2006
Abstract:
The development of probabilistic inference techniques has made considerable progress in recent years, in particular with respect to exploiting the structure (e.g., factored, hierarchical or relational) of discrete and continuous problem domains. We show that these techniques can be used also for solving Markov Decision Processes (MDPs) or partial observable MDPs (POMDPs) when formulated in terms of a structured dynamic Bayesian network (DBN). The approach is based on an equivalence between maximization of the expected future return in the time-unlimited MDP and likelihood maximization in a related mixture of finite-time MDPs. This allows us to use expectation maximization (EM) for computing optimal policies, using arbitrary inference techniques in the E-step. Unlike previous approaches we can show that this actually optimizes the discounted expected future return for arbitrary reward functions and without assuming an ad hoc finite total time. We first develop the approach for standard MDPs and demonstrate it using exact inference on a discrete maze and Gaussian belief state propagation in non-linear stochastic optimal control problems. Then we present an extension for solving POMDPs. We consider an agent model that includes an internal memory variable used for gating reactive behaviors. Using exact inference on the respective DBN, the EM-algorithm solves complex maze problems by learning appropriate internal memory representations.
Links To Paper
Pdf
Bibtex format
@Misc{EDI-INF-RR-0934,
author = { Marc Toussaint and Stefan Harmeling and Amos Storkey },
title = {Probabilistic inference for solving (PO)MDP's},
year = 2006,
month = {Dec},
url = {http://www.anc.ed.ac.uk/~amos/publications/ToussaintetAl2006ProbabilisticInferenceForSolvingPOMDPs.pdf},
}


Home : Publications : Report 

Please mail <reports@inf.ed.ac.uk> with any changes or corrections.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh