Cavazos-Cadena, Rolando - In: Mathematical Methods of Operations Research 56 (2002) 2, pp. 181-196
This work concerns finite-state Markov decision chains endowed with the long-run average reward criterion. Assuming that the optimality equation has a solution, it is shown that a nearly optimal stationary policy, as well as an approximation to the optimal average reward within a specified error,...