On Double Descent in Reinforcement Learning with LSTD and Random Features.
David Brellmann
Eloïse Berthier
David Filliat
Goran Frehse
Published in:
CoRR (2023)
Keyphrases
reinforcement learning
temporal difference
function approximation
state space
learning algorithm
policy evaluation
optimal policy
Markov decision processes
temporal difference learning
least squares
Monte Carlo
model-free
multi-agent
control problems
TD learning
action selection
reinforcement learning methods