Saghafian, Soroush - 2021 - Version: December 8, 2021
dynamics and causal effects on observed variables. Using this connection, we develop two reinforcement learning methods termed … Direct Augmented V-Learning (DAV-Learning) and Safe Augmented V-Learning (SAV-Learning), which make it possible to use the observed data … to efficiently learn an optimal treatment regime. We establish theoretical results for these learning methods, including …