Search Results - subject:"reward"

New sufficient conditions for average optimality in continuous-time Markov decision processes

Ye, Liuer; Guo, Xianping - In: Computational Statistics 72 (2010) 1, pp. 75-94

the long-run expected average reward criterion. The transition rates of the underlying continuous-time Markov processes … are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. We provide new sufficient …

Persistent link: https://www.econbiz.de/10010759494

Markov control processes with pathwise constraints

Mendoza-Pérez, Armando; Hernández-Lerma, Onésimo - In: Computational Statistics 71 (2010) 3, pp. 477-502

be optimized is a long-run sample-path (or pathwise) average reward subject to constraints on a long-run pathwise average …

Persistent link: https://www.econbiz.de/10010847903

Sample-path optimality and variance-maximization for Markov decision processes

Zhu, Q. - In: Computational Statistics 65 (2007) 3, pp. 519-538

This paper studies both the average sample-path reward (ASPR) criterion and the limiting average variance criterion for …

Persistent link: https://www.econbiz.de/10010759507

Variance minimization and the overtaking optimality approach to continuous-time controlled Markov chains

Prieto-Rumeau, Tomás; Hernández-Lerma, Onésimo - In: Computational Statistics 70 (2009) 3, pp. 527-540

reward rates. It concerns optimality criteria that improve the usual expected average reward criterion. First, we show the … existence of average reward optimal policies with minimal average variance. Then we compare the variance minimization criterion …

Persistent link: https://www.econbiz.de/10010759505

Constrained continuous-time Markov decision processes with average criteria

Zhang, Lanlan; Guo, Xianping - In: Computational Statistics 67 (2008) 2, pp. 323-340

unbounded reward/cost and transition rates. The criterion to be maximized is the expected average reward, and a constraint is …

Persistent link: https://www.econbiz.de/10010759169

On mean reward variance in semi-Markov processes

Sladký, Karel - In: Computational Statistics 62 (2005) 3, pp. 387-397

As an extension of the discrete-time case, this note investigates the variance of the total cumulative reward for the …

Persistent link: https://www.econbiz.de/10010847602

The Laurent series, sensitive discount and Blackwell optimality for continuous-time controlled Markov chains

Prieto-Rumeau, Tomás; Hernández-Lerma, Onésimo - In: Computational Statistics 61 (2005) 1, pp. 123-145

-time controlled Markov chains with possibly unbounded reward (or cost) rates and unbounded transition rates. That series is then used …

Persistent link: https://www.econbiz.de/10010847972

Semi-infinite weighted Markov decision processes with perturbation

Abbad, Mohammed; Rahhali, Khalid - In: Computational Statistics 60 (2004) 2, pp. 251-265

In this paper, Weighted reward Perturbed Markov Decision Processes with finite state and countable action spaces (semi …-infinite WMDP for short) are considered. The “weighted reward” refers to appropriately normalized convex combination of the … discounted and the long-run average reward criteria. This criterion allows the controller to trade off short-term rewards versus …

Persistent link: https://www.econbiz.de/10010759177

Stability estimates in the problem of average optimal switching of a Markov chain

Gordienko, Evgueni; Yushkevich, Alexander - In: Computational Statistics 57 (2003) 3, pp. 345-365

We consider a switching model for a Markov chain x_t with a transition probability p(x|B). The goal of a controller is to maximize the average gain by selecting a sequence of stopping times, in which the controller gets rewards and pays costs (depending on x_t) in an alternating order. We...

Persistent link: https://www.econbiz.de/10010759201

The critical discount factor for finite Markovian decision processes with an absorbing set

Hinderer, K.; Waldmann, K.-H. - In: Computational Statistics 57 (2003) 1, pp. 1-19

-stage value function V_N for N → ∞ exists and is finite for each choice of the one-stage reward function. Several representations …

Persistent link: https://www.econbiz.de/10010847666