Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference methods. Here we generalize eligibility traces to off-policy learning, in which one learns about a policy different from...
Persistent link: https://www.econbiz.de/10009468145
Persistent link: https://www.econbiz.de/10012039988