Dai, Min; Dong, Yuchao; Jia, Yanwei - 2021
We study a dynamic mean-variance portfolio optimization problem under the reinforcement learning framework, where an entropy regularizer is introduced to induce exploration. Because the mean-variance criterion is time-inconsistent, we aim to learn an equilibrium strategy. Under an...