Wang, Xianjia; Yang, Zhipeng; Chen, Guici; Liu, Yanli - 2023
The backward recursive method for solving Markov decision processes (MDPs) requires knowing the optimal expected payoff of the subprocess that follows any state at any time, yet this payoff cannot be known during the actual decision process. To overcome this contradiction, this paper proposes a...