This note concerns discrete-time Markov decision processes with denumerable state space. A control policy is graded by the long-run expected average reward criterion, and the main feature of the model is that the reward function and the transition law depend on an unknown parameter. Besides...
Persistent link: https://www.econbiz.de/10010950138
This note concerns discrete-time Markov decision processes with denumerable state space. A control policy is graded by the long-run expected average reward criterion, and the main feature of the model is that the reward function and the transition law depend on an unknown parameter. Besides...
Persistent link: https://www.econbiz.de/10010759346