Gradient Q(σ, λ): A Unified Algorithm with Function Approximation for Reinforcement Learning.
Long Yang
Yu Zhang
Qian Zheng
Pengfei Li
Gang Pan
Published in:
CoRR (2019)
Keyphrases
function approximation
reinforcement learning
mountain car
model free
dynamic programming
learning algorithm
temporal difference learning
temporal difference
policy gradient
function approximators
neural network
convergence rate
actor critic
search space
gradient method
TD learning
Markov decision processes
transfer learning
step size