This paper addresses the multi-armed bandit problem with switching penalties including both costs and delays, extending results of the companion paper [J. Niño-Mora. "Two-Stage Index Computation for Bandits with Switching Penalties I: Switching Costs". Conditionally accepted at INFORMS J....
Persistent link: https://www.econbiz.de/10005249610
In this paper we consider the problem of admission control of Bernoulli arrivals to a buffer with a geometric server, in which the controller’s actions take effect one period after the actual change in the queue length. An optimal policy in terms of marginal productivity indices (MPI) is derived...
Persistent link: https://www.econbiz.de/10005249624
The Whittle index [P. Whittle (1988). Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25A, 287-298] yields a practical scheduling rule for the versatile yet intractable multi-armed restless bandit problem, involving the optimal dynamic priority allocation to multiple...
Persistent link: https://www.econbiz.de/10005249631
This paper addresses the multi-armed bandit problem with switching costs. Asawa and Teneketzis (1996) introduced an index that partly characterizes optimal policies, attaching to each bandit state a "continuation index" (its Gittins index) and a "switching index". They proposed to jointly...
Persistent link: https://www.econbiz.de/10005249647