Discounted Reinforcement Learning Does Not Scale.

Matthew A. F. McDonald, Philip Hingston

Published in: Comput. Intell. (1997)

Keyphrases

reinforcement learning
Markov decision processes
optimal policy
function approximation
Markov decision process
dynamic programming
state space
average reward
model free
reinforcement learning algorithms
learning process
action space
temporal difference learning
finite state
infinite horizon
reinforcement learning methods
cash flow
data sets
robotic control
temporal difference
dynamical systems
scale space
multi agent systems
multi agent
genetic algorithm
real world