On Double Descent in Reinforcement Learning with LSTD and Random Features

David Brellmann, Eloïse Berthier, David Filliat, Goran Frehse

Published in: ICLR (2024)

Keyphrases

reinforcement learning
temporal difference
function approximation
state space
temporal difference learning
model-free
least squares
control problems
reinforcement learning algorithms
multi-agent
policy evaluation
optimal control
optimal policy
evaluation function
supervised learning
Markov decision processes
regression model
Markov chain
policy iteration
function approximators