Liu, Ke; Filar, Jerzy A. - In: Mathematical Methods of Operations Research 53 (2001) 3, pp. 465-480
In this paper we consider the weighted reward Markov decision process with perturbation. The “weighted reward” refers to an appropriately normalized convex combination of the discounted and the long-run average reward criteria. This criterion allows the controller to trade off short-term costs...
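
As an illustration of the weighted reward criterion described in the abstract, the following minimal sketch evaluates it for a fixed stationary policy on a two-state Markov chain. It assumes the common normalization in which the discounted value is scaled by (1 - beta), so that the weighted reward in state s is lam * (1 - beta) * v_beta(s) + (1 - lam) * g. The abstract does not state the exact form, and the names `discounted_value`, `average_reward`, `weighted_reward`, `beta`, and `lam` are hypothetical. Perturbation is not modeled here.

```python
def discounted_value(P, r, beta):
    """Solve (I - beta*P) v = r for a two-state chain via Cramer's rule."""
    a11 = 1 - beta * P[0][0]
    a12 = -beta * P[0][1]
    a21 = -beta * P[1][0]
    a22 = 1 - beta * P[1][1]
    det = a11 * a22 - a12 * a21
    v0 = (r[0] * a22 - a12 * r[1]) / det
    v1 = (a11 * r[1] - a21 * r[0]) / det
    return [v0, v1]


def average_reward(P, r):
    """Long-run average reward of a two-state chain with P[0][1] + P[1][0] > 0."""
    a, b = P[0][1], P[1][0]
    pi0 = b / (a + b)  # stationary probability of state 0
    pi1 = a / (a + b)  # stationary probability of state 1
    return pi0 * r[0] + pi1 * r[1]


def weighted_reward(P, r, beta, lam):
    """Assumed form: lam * (1 - beta) * v_beta(s) + (1 - lam) * g for each state s."""
    v = discounted_value(P, r, beta)
    g = average_reward(P, r)
    return [lam * (1 - beta) * v_s + (1 - lam) * g for v_s in v]


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    P = [[0.9, 0.1], [0.5, 0.5]]
    r = [1.0, 0.0]
    print(weighted_reward(P, r, beta=0.9, lam=0.5))
```

Setting `lam` to 1 recovers the normalized discounted criterion and setting it to 0 recovers the long-run average criterion, which is the trade-off between short-term and long-term performance that the abstract refers to.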