Temporal Difference Methods for the Variance of the Reward To Go.

Aviv Tamar, Dotan Di Castro, Shie Mannor

Published in: ICML (3) (2013)

Keyphrases

temporal difference methods
reinforcement learning
function approximation
temporal difference
policy search
reinforcement learning problems
evolutionary methods
policy evaluation
reinforcement learning algorithms
function approximators
reward function
variance reduction
long run
policy gradient
evolutionary computation
optimization algorithm
neural network
average reward
optimal control
decision making
machine learning