BANERJEE, BIKRAMJIT; KRAEMER, LANDON - In: Advances in Complex Systems (ACS) 14 (2011) 02, pp. 279-305
The design of reinforcement learning solutions to many problems artificially constrains the action set available to an agent in order to limit the exploration/sample complexity. While exploring, if an agent can discover new actions that can break through the constraints of its basic/atomic...