Jia, Huiwen; Shi, Cong; Shen, Siqian - 2021
We consider a general class of multi-armed bandits (MAB) problems with sub-exponential rewards. This is primarily motivated by service systems with exponential inter-arrival and service distributions. It is well-known that the celebrated Upper Confidence Bound (UCB) algorithm can achieve tight...