Bayesian Learning of Noisy Markov Decision Processes
This work addresses the problem of estimating the optimal value function in a Markov Decision Process from observed state-action pairs. We adopt a Bayesian approach to inference, which allows both the model to be estimated and predictions about actions to be made in a unified framework, providing a principled approach to mimicry of a controller on the basis of observed data. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution over the optimal value function. This sampler includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.
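As a generic illustration of the MCMC machinery the abstract refers to (not the paper's own sampler, whose target and parameter-expansion details are not given here), a minimal random-walk Metropolis sketch against a toy log-posterior might look like this; the standard-normal target is purely a stand-in assumption:

```python
import math
import random


def log_posterior(theta):
    # Toy log-density (standard normal), standing in for the
    # posterior over the optimal value function, which the
    # abstract does not specify.
    return -0.5 * theta * theta


def metropolis(n_iters, step=1.0, seed=0):
    """Random-walk Metropolis sampler: a generic sketch only."""
    rng = random.Random(seed)
    theta = 0.0
    samples = []
    for _ in range(n_iters):
        # Propose a Gaussian perturbation of the current state.
        proposal = theta + rng.gauss(0.0, step)
        # Accept with probability min(1, pi(proposal) / pi(theta)).
        log_accept = log_posterior(proposal) - log_posterior(theta)
        if math.log(rng.random()) < log_accept:
            theta = proposal
        samples.append(theta)
    return samples


samples = metropolis(5000)
posterior_mean = sum(samples) / len(samples)
```

The paper's contribution is a parameter expansion step that improves mixing of such a chain; the sketch above shows only the baseline Metropolis recursion it would augment.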
Year of publication: 2010
Authors: Singh, Sumeetpal S.; Chopin, Nicolas; Whiteley, Nick
Institutions: Centre de Recherche en Économie et Statistique (CREST), Groupe des Écoles Nationales d'Économie et Statistique (GENES)
Freely available.
Similar items by person:
- Chopin, Nicolas, (2013)
- Bayesian learning of noisy Markov decision processes. Singh, Sumeetpal S., (2010)
- A Sequential Particle Filter Method for Static Models. Chopin, Nicolas, (2000)
- More ...