Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting.
Gen Li
Yuxin Chen
Yuejie Chi
Yuantao Gu
Yuting Wei
Published in:
CoRR (2021)
Keyphrases
reinforcement learning
Markov decision processes
state space
optimal policy
function approximation
machine learning
learning algorithm
objective function
temporal difference
reinforcement learning algorithms
Markov decision process
limited memory
average reward
Markov decision problems