Bubeck, Sébastien; Munos, Rémi; Stoltz, Gilles; … - HAL - 2011
We consider a generalization of stochastic bandits where the set of arms, $\mathcal{X}$, is allowed to be a generic measurable space and the mean-payoff function is "locally Lipschitz" with respect to a dissimilarity function that is known to the decision maker. Under this condition we construct an...