Cartea, Álvaro; Drissi, Fayçal; Osselin, Pierre - 2023
… uses a new contextual bandit algorithm, which we call MTGP-LR, to learn the reward functions that map features to trading … algorithm employs a new online change-point detection test to learn in non-stationary environments. As an application in optimal …