Similar Search Results

Value iteration and approximately optimal stationary policies in finite-state average Markov decision chains

Cavazos-Cadena, Rolando - In: Mathematical Methods of Operations Research 56 (2002) 2, pp. 181-196

This work concerns finite-state Markov decision chains endowed with the long-run average reward criterion. Assuming that the optimality equation has a solution, it is shown that a nearly optimal stationary policy, as well as an approximation to the optimal average reward within a specified error,...

Persistent link: https://www.econbiz.de/10010950234

Value iteration and approximately optimal stationary policies in finite-state average Markov decision chains

Cavazos-Cadena, Rolando - In: Computational Statistics 56 (2002) 2, pp. 181-196

Persistent link: https://www.econbiz.de/10010759438

Adaptive control of average Markov decision chains under the Lyapunov stability condition

Cavazos-Cadena, Rolando - In: Mathematical Methods of Operations Research 54 (2001) 1, pp. 63-99

This note concerns discrete-time Markov decision processes with denumerable state space. A control policy is graded by the long-run expected average reward criterion, and the main feature of the model is that the reward function and the transition law depend on an unknown parameter. Besides...

Persistent link: https://www.econbiz.de/10010950138

Adaptive control of average Markov decision chains under the Lyapunov stability condition

Cavazos-Cadena, Rolando - In: Computational Statistics 54 (2001) 1, pp. 63-99

Persistent link: https://www.econbiz.de/10010759346