Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning.

Kristopher De Asis, Alan Chan, Silviu Pitis, Richard S. Sutton, Daniel Graves

Published in: AAAI (2020)

Keyphrases

temporal difference methods
reinforcement learning
function approximation
temporal difference
function approximators
policy search
policy evaluation
evolutionary methods
reinforcement learning problems
reinforcement learning algorithms
model-free
TD learning
evaluation function
action selection
supervised learning
neural network
Markov decision processes
Monte Carlo
least squares
machine learning
decision making
control problems
reward function
differential evolution
evolutionary computation
semi-supervised
multi-agent