Revisiting Peng's Q(λ) for Modern Reinforcement Learning.

Tadashi Kozuno, Yunhao Tang, Mark Rowland, Rémi Munos, Steven Kapturowski, Will Dabney, Michal Valko, David Abel

Published in: CoRR (2021)

Keyphrases

reinforcement learning
function approximation
reinforcement learning algorithms
state space
optimal policy
reinforcement learning methods
learning algorithm
data mining
temporal difference learning
model-free
optimal control
dynamic programming
learning process
multi-agent
supervised learning
Markov decision processes
website
artificial intelligence
control problems
machine learning
neural network
multi-agent reinforcement learning
policy search
relational reinforcement learning