I analyze the effects of memory limitations on the endogenous learning behavior of an agent in a standard two-armed bandit problem. In each period, an infinitely lived agent chooses between two alternatives of unknown type to maximize discounted payoffs. The agent can experiment with each...