Reinforcement with Fading Memories

2019-07-29

Kuang Xu, Se-Young Yun

Abstract

We study the effect of imperfect memory on decision making in the context of a stochastic sequential action-reward problem. An agent chooses a sequence of actions which generate discrete rewards at different rates. She is allowed to make new choices at rate β, while past rewards disappear from her memory at rate μ. We focus on a family of decision rules where the agent makes a new choice by randomly selecting an action with a probability approximately proportional to the amount of past rewards associated with each action in her memory. We provide closed-form formulae for the agent's steady-state choice distribution in the regime where the memory span is large (μ → 0), and show that the agent's success critically depends on how quickly she updates her choices relative to the speed of memory decay. If β ≫ μ, the agent almost always chooses the best action, i.e., the one with the highest reward rate. Conversely, if β ≪ μ, the agent chooses an action with a probability roughly proportional to its reward rate.
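
The dynamics described in the abstract can be explored with a small event-driven simulation. The sketch below is an illustration under stated assumptions, not the authors' model or code. It assumes rewards for the current action arrive as a Poisson process, each remembered reward is forgotten independently at the memory decay rate (called `mu` here), and choices are refreshed at the update rate (called `beta` here). The smoothing constant `alpha`, the function name `simulate_fading_memory`, and all parameter names are hypothetical; `alpha` stands in for the "approximately proportional" part of the rule by keeping every action selectable when its memory is empty.

```python
import random


def simulate_fading_memory(rates, beta, mu, alpha=1.0, horizon=1000.0, seed=0):
    """Return the fraction of time spent on each action (illustrative sketch).

    rates   : reward rate of each action (assumed Poisson arrivals)
    beta    : rate at which the agent makes a new choice (must be > 0)
    mu      : rate at which each remembered reward is forgotten
    alpha   : hypothetical smoothing constant added to every memory count
    horizon : total simulated time
    """
    rng = random.Random(seed)
    k = len(rates)
    memory = [0] * k          # remembered rewards per action
    action = rng.randrange(k)  # arbitrary initial choice
    time_on = [0.0] * k
    t = 0.0

    while True:
        r_reward = rates[action]       # reward for the current action
        r_forget = mu * sum(memory)    # some remembered reward fades
        r_update = beta                # agent re-draws her choice
        total = r_reward + r_forget + r_update

        dt = rng.expovariate(total)    # time until the next event
        if t + dt >= horizon:
            time_on[action] += horizon - t
            break
        time_on[action] += dt
        t += dt

        u = rng.random() * total
        if u < r_reward:
            memory[action] += 1
        elif u < r_reward + r_forget:
            # Forget one reward, uniformly among all remembered rewards.
            idx = rng.choices(range(k), weights=memory)[0]
            memory[idx] -= 1
        else:
            # New choice with probability proportional to memory + alpha.
            weights = [m + alpha for m in memory]
            action = rng.choices(range(k), weights=weights)[0]

    return [x / horizon for x in time_on]


if __name__ == "__main__":
    # Compare a fast-updating agent with a slow-updating one.
    print(simulate_fading_memory([1.0, 2.0], beta=5.0, mu=0.05))
    print(simulate_fading_memory([1.0, 2.0], beta=0.005, mu=0.05))
```

Running the two calls with a longer `horizon` and a smaller `mu` gives a rough way to compare the empirical time shares against the two regimes the abstract describes. The result depends on the seed and on the assumed smoothing rule, so it should be read as a qualitative check only.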
