Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning.

Kristopher De Asis, Alan Chan, Silviu Pitis, Richard S. Sutton, Daniel Graves

Published in: CoRR (2019)

Keyphrases

temporal difference methods
reinforcement learning
temporal difference
function approximation
policy search
function approximators
evolutionary methods
policy evaluation
reinforcement learning problems
reinforcement learning algorithms
TD learning
least squares
evolutionary algorithm
evolutionary computation
model free
step size
machine learning
cost function
TD methods
action selection
neural network
policy iteration
Markov decision processes
transfer learning
particle swarm optimization
state space
learning process