Similar Search Results

A Structured Multiarmed Bandit Problem and the Greedy Policy

Rusmevichientong, Paat; Mersereau, Adam J.; Tsitsiklis, … - 2009

We consider a multiarmed bandit problem where the expected reward of each arm is a linear function of an unknown scalar with a prior distribution. The objective is to choose a sequence of arms that maximizes the expected total (or discounted total) reward. We demonstrate the effectiveness of a...

Persistent link: https://www.econbiz.de/10009432173