Mannor, Shie; Simester, Duncan; Sun, Peng; Tsitsiklis, … - In: Management Science 53 (2007) 2, pp. 308-322
We consider a finite-state, finite-action, infinite-horizon, discounted reward Markov decision process and study the bias and variance in the value function estimates that result from empirical estimates of the model parameters. We provide closed-form approximations for the bias and variance,...
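The effect described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's closed-form approximation: it assumes a hypothetical two-state chain under a single fixed policy, with made-up transition probabilities, rewards, and discount factor, and it measures the bias and variance of the plug-in value estimate by simulation. All names (`value_function`, `estimate_transitions`, `bias_and_variance`) are illustrative.

```python
import random
import statistics

# Hypothetical model, chosen only for illustration (not from the paper).
GAMMA = 0.9                          # discount factor
REWARDS = (1.0, 0.0)                 # reward received in state 0 and state 1
TRUE_P = ((0.7, 0.3), (0.4, 0.6))    # true transition matrix under a fixed policy


def value_function(p, rewards=REWARDS, gamma=GAMMA):
    """Solve V = r + gamma * P V for a two-state chain with an explicit 2x2 inverse."""
    a = 1.0 - gamma * p[0][0]
    b = -gamma * p[0][1]
    c = -gamma * p[1][0]
    d = 1.0 - gamma * p[1][1]
    det = a * d - b * c
    v0 = (d * rewards[0] - b * rewards[1]) / det
    v1 = (-c * rewards[0] + a * rewards[1]) / det
    return (v0, v1)


def estimate_transitions(true_p, n_samples, rng):
    """Estimate each row of P from n_samples observed transitions out of that state."""
    estimated = []
    for row in true_p:
        to_state_0 = sum(1 for _ in range(n_samples) if rng.random() < row[0])
        p0 = to_state_0 / n_samples
        estimated.append((p0, 1.0 - p0))
    return tuple(estimated)


def bias_and_variance(n_samples, n_trials=2000, seed=0):
    """Empirical bias and variance of the plug-in estimate of V(state 0)."""
    rng = random.Random(seed)
    true_v0 = value_function(TRUE_P)[0]
    estimates = [
        value_function(estimate_transitions(TRUE_P, n_samples, rng))[0]
        for _ in range(n_trials)
    ]
    bias = statistics.fmean(estimates) - true_v0
    variance = statistics.pvariance(estimates)
    return bias, variance


if __name__ == "__main__":
    for n in (10, 50, 200):
        bias, variance = bias_and_variance(n)
        print(f"n={n:4d}  bias={bias:+.4f}  variance={variance:.4f}")
```

Because the value function is a nonlinear function of the transition probabilities, an unbiased estimate of the model does not generally give an unbiased estimate of the value; the sketch makes that visible for small sample sizes under the stated assumptions.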