Similar Search Results

Value iteration and approximately optimal stationary policies in finite-state average Markov decision chains

Cavazos-Cadena, Rolando - In: Computational Statistics 56 (2002) 2, pp. 181-196

This work concerns finite-state Markov decision chains endowed with the long-run average reward criterion. Assuming that the optimality equation has a solution, it is shown that a nearly optimal stationary policy, as well as an approximation to the optimal average reward within a specified error,...

Persistent link: https://www.econbiz.de/10010759438

Nearly optimal stationary policies in negative dynamic programming

Cavazos-Cadena, Rolando; Montes-de-Oca, Raúl - In: Computational Statistics 49 (1999) 3, pp. 441-456

This work concerns controlled Markov chains with denumerable state space and discrete time parameter. The reward function is assumed to be ≤ 0 and the performance of a control policy is measured by the expected total-reward criterion. Within this context, sufficient conditions are given so that...

Persistent link: https://www.econbiz.de/10010847483

Solution to the risk-sensitive average optimality equation in communicating Markov decision chains with finite state space: An alternative approach

Cavazos-Cadena, Rolando; Hernández-Hernández, Daniel - In: Computational Statistics 56 (2003) 3, pp. 473-479

This note concerns Markov decision chains with finite state and action sets. The decision maker is assumed to be risk-averse with constant risk sensitive coefficient λ, and the performance of a control policy is measured by the risk-sensitive average cost criterion. In their seminal paper...

Persistent link: https://www.econbiz.de/10010759199

Optimality equations and inequalities in a class of risk-sensitive average cost Markov decision chains

Cavazos-Cadena, Rolando - In: Computational Statistics 71 (2010) 1, pp. 47-84

This note concerns controlled Markov chains on a denumerable state space. The performance of a control policy is measured by the risk-sensitive average criterion, and it is assumed that (a) the simultaneous Doeblin condition holds, and (b) the system is communicating under the action of each...

Persistent link: https://www.econbiz.de/10010759218

Nearly optimal policies in risk-sensitive positive dynamic programming on discrete spaces

Cavazos-Cadena, Rolando; Montes-de-Oca, Raúl - In: Computational Statistics 52 (2000) 1, pp. 133-167

This note concerns Markov decision processes on a discrete state space. It is supposed that the reward function is nonnegative, and that the decision maker has a nonnull constant risk-sensitivity, which leads to grade random rewards via the expectation of an exponential utility function. The...

Persistent link: https://www.econbiz.de/10010759334

Adaptive control of average Markov decision chains under the Lyapunov stability condition

Cavazos-Cadena, Rolando - In: Computational Statistics 54 (2001) 1, pp. 63-99

This note concerns discrete-time Markov decision processes with denumerable state space. A control policy is graded by the long-run expected average reward criterion, and the main feature of the model is that the reward function and the transition law depend on an unknown parameter. Besides...

Persistent link: https://www.econbiz.de/10010759346

Solutions of the average cost optimality equation for finite Markov decision chains: risk-sensitive and risk-neutral criteria

Cavazos-Cadena, Rolando - In: Computational Statistics 70 (2009) 3, pp. 541-566

This work is concerned with controlled Markov chains with finite state and action spaces. It is assumed that the decision maker has an arbitrary but constant risk sensitivity coefficient, and that the performance of a control policy is measured by the long-run average cost criterion. Within this...

Persistent link: https://www.econbiz.de/10010759540

Controlled Markov chains with risk-sensitive criteria: Average cost, optimality equations, and optimal solutions

Cavazos-Cadena, Rolando; Fernández-Gaucherand, Emmanuel - In: Computational Statistics 49 (1999) 2, pp. 299-324

We study controlled Markov chains with denumerable state space and bounded costs per stage. A (long-run) risk-sensitive average cost criterion, associated to an exponential utility function with a constant risk sensitivity coefficient, is used as a performance measure. The main assumption on the...

Persistent link: https://www.econbiz.de/10010759565

Solution to the risk-sensitive average cost optimality equation in a class of Markov decision processes with finite state space

Cavazos-Cadena, Rolando - In: Computational Statistics 57 (2003) 2, pp. 263-285

This work concerns discrete-time Markov decision processes with finite state space and bounded costs per stage. The decision maker ranks random costs via the expectation of the utility function associated to a constant risk sensitivity coefficient, and the performance of a control policy is...

Persistent link: https://www.econbiz.de/10010759568