cumulative reward. When the environment is partially observable, the agent cannot determine the states with certainty. These … Information Gain Ratio ranking with respect to the cumulative expected reward. Experiments carried out on three different RL tasks …