Similar Search Results

Adaptive control of average Markov decision chains under the Lyapunov stability condition

Cavazos-Cadena, Rolando - In: Mathematical Methods of Operations Research 54 (2001) 1, pp. 63-99

This note concerns discrete-time Markov decision processes with denumerable state space. A control policy is graded by the long-run expected average reward criterion, and the main feature of the model is that the reward function and the transition law depend on an unknown parameter. Besides...

Persistent link: https://www.econbiz.de/10010950138

Adaptive control of average Markov decision chains under the Lyapunov stability condition

Cavazos-Cadena, Rolando - In: Computational Statistics 54 (2001) 1, pp. 63-99

Persistent link: https://www.econbiz.de/10010759346