Similar Search Results

Nearly optimal stationary policies in negative dynamic programming

Cavazos-Cadena, Rolando; Montes-De-Oca, Raúl - In: Computational Statistics 49 (1999) 3, pp. 441-456

This work concerns controlled Markov chains with denumerable state space and discrete time parameter. The reward function is assumed to be ≤ 0 and the performance of a control policy is measured by the expected total-reward criterion. Within this context, sufficient conditions are given so that...

Persistent link: https://www.econbiz.de/10010847483

Risk-sensitive capacity control in revenue management

Barz, C.; Waldmann, K. - In: Computational Statistics 65 (2007) 3, pp. 565-579

Both the static and the dynamic single-leg revenue management problem are studied from the perspective of a risk-averse decision maker. Structural results well-known from the risk-neutral case are extended to the risk-averse case on the basis of an exponential utility function. In particular,...

Persistent link: https://www.econbiz.de/10010847579

Compactness of the space of non-randomized policies in countable-state sequential decision processes

Chen, Richard; Feinberg, Eugene - In: Computational Statistics 71 (2010) 2, pp. 307-323

For sequential decision processes with countable state spaces, we prove compactness of the set of strategic measures corresponding to nonrandomized policies. For the Borel state case, this set may not be compact (Piunovskiy, Optimal control of random sequences in problems with constraints....

Persistent link: https://www.econbiz.de/10010847644

Weighted Markov decision processes with perturbation

Liu, Ke; Filar, Jerzy A. - In: Computational Statistics 53 (2001) 3, pp. 465-480

In this paper we consider the weighted reward Markov decision process, with perturbation. The “weighted reward” refers to appropriately normalized convex combination of the discounted and the long-run average reward criteria. This criterion allows the controller to trade-off short-term costs...

Persistent link: https://www.econbiz.de/10010847712

Multi-policy iteration with a distributed voting

Chang, Hyeong Soo - In: Computational Statistics 60 (2004) 2, pp. 299-310

We present a novel simulation-based algorithm, as an extension of the well-known policy iteration algorithm, by combining multi-policy improvement with a distributed simulation-based voting policy evaluation, for approximately solving Markov Decision Processes (MDPs) with infinite horizon...

Persistent link: https://www.econbiz.de/10010847739

Dynamic order replenishment policy in internet-based supply chains

Berman, Oded; Kim, Eungab - In: Computational Statistics 53 (2001) 3, pp. 371-390

We consider a problem of dynamic replenishment of parts in the supply chain consisting of single class of customers, company, and supplier. Customers request a service via the WEB-based ordering system and the company supports service using parts which are procured from the supplier. The...

Persistent link: https://www.econbiz.de/10010847765

The finiteness of the reward function and the optimal value function in Markov decision processes

Hu, Qiying; Xu, Chen - In: Computational Statistics 49 (1999) 2, pp. 255-266

This paper studies the discrete time Markov decision processes (MDP) with expected discounted total reward, where the state space is countable, the action space is measurable, the reward function is extended real-valued, and the discount rate may be any real number. Two conditions (GC) and (C)...

Persistent link: https://www.econbiz.de/10010847961

Algorithms for aggregated limiting average Markov decision problems

Abbad, Mohammed; Daoui, Cherki - In: Computational Statistics 53 (2001) 3, pp. 451-463

We consider discrete time Markov Decision Process (MDP) with finite state and action spaces under average reward optimality criterion. The decomposition theory, in Ross and Varadarajan [11], leads to a natural partition of the state space into strongly communicating classes and a set of states...

Persistent link: https://www.econbiz.de/10010847980

Accelerated modified policy iteration algorithms for Markov decision processes

Shlakhter, Oleksandr; Lee, Chi-Guhn - In: Computational Statistics 78 (2013) 1, pp. 61-76

We propose a new approach to accelerate the convergence of the modified policy iteration method for Markov decision processes with the total expected discounted reward. In the new policy iteration an additional operator is applied to the iterate generated by Markov operator, resulting in a...

Persistent link: https://www.econbiz.de/10010848004

Dynamic inventory strategies for profit maximization in a service facility with stochastic service, demand and lead time

Berman, Oded; Kim, Eungab - In: Computational Statistics 60 (2004) 3, pp. 497-521

This paper addresses an optimal inventory control in a supply chain in which customers arrive at a facility according to a Poisson process and the facility provides service which takes exponential amounts of time, using items supplied by an outside supplier with exponential lead time process....

Persistent link: https://www.econbiz.de/10010759172