Ordinal Dynamic Programming
Numerically valued reward processes are found in most dynamic programming models. Mitten, however, recently formulated finite horizon sequential decision processes in which a real-valued reward need not be earned at each stage. Instead of the cardinality assumption implicit in past models, Mitten assumes that a decision maker has a preference order over a general collection of outcomes (which need not be numerically valued). This paper investigates infinite horizon ordinal dynamic programming models. Both deterministic and stochastic models are considered. It is shown that an optimal policy exists if and only if some stationary policy is optimal. Moreover, "policy improvement" leads to better policies using either Howard-Blackwell or Eaton-Zadeh procedures. The results illuminate the roles played by various sets of assumptions in the literature on Markovian decision processes.
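The abstract's central idea, that a decision maker needs only a preference order over outcomes rather than numerical rewards, can be sketched in a toy finite-horizon, deterministic setting of the kind Mitten considered. The following is a minimal illustration, not the paper's construction: all state, action, and outcome names are hypothetical, the outcome is assumed to be earned only at the horizon, and the preference order is encoded purely so outcomes can be compared, never added or averaged.

```python
# Hypothetical ordinal backward induction: outcomes are labels, not numbers.
# The decision maker supplies only a preference order over outcomes (assumed data).
preference = ["poor", "fair", "good", "excellent"]   # least to most preferred
rank = {o: i for i, o in enumerate(preference)}      # used only for comparison

HORIZON = 2
actions = {(0, "s0"): ["a", "b"], (1, "s1"): ["a", "b"], (1, "s2"): ["a"]}
transitions = {(0, "s0", "a"): "s1", (0, "s0", "b"): "s2",
               (1, "s1", "a"): "t1", (1, "s1", "b"): "t2", (1, "s2", "a"): "t3"}
terminal_outcome = {"t1": "fair", "t2": "excellent", "t3": "good"}

def best_outcome(stage, state):
    """Most preferred terminal outcome reachable from `state` at `stage`."""
    if stage == HORIZON:
        return terminal_outcome[state]
    candidates = [best_outcome(stage + 1, transitions[(stage, state, act)])
                  for act in actions[(stage, state)]]
    return max(candidates, key=rank.get)  # only order comparisons, no arithmetic

print(best_outcome(0, "s0"))  # -> "excellent"
```

Because the recursion uses only comparisons under the preference order, no cardinal reward structure is ever invoked; the infinite-horizon stochastic case analyzed in the paper requires substantially more machinery than this sketch.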
Year of publication: 1975
Authors: Sobel, Matthew J.
Published in: Management Science. - Institute for Operations Research and the Management Sciences - INFORMS, ISSN 0025-1909. - Vol. 21.1975, 9, p. 967-975
Publisher: Institute for Operations Research and the Management Sciences - INFORMS
Similar items by person
- Scheduling projects with stochastic activity duration to maximize expected net present value. Sobel, Matthew J. (2009)
- Capital accumulation and the optimization of renewable resource models. Mendelssohn, Roy (1980)
- Risk-Sensitive Dynamic Market Share Attraction Games. Monahan, George E. (1997)
- More ...