Regularized Off-Policy TD-Learning.
Bo Liu
Sridhar Mahadevan
Ji Liu
Published in:
CoRR (2020)
Keyphrases
TD learning
temporal difference
evaluation function
function approximation
reinforcement learning
multi step
policy evaluation
least squares
reinforcement learning algorithms
evolutionary algorithm
Monte Carlo
model free
data mining
Markov decision processes