On the Power of Global Reward Signals in Reinforcement Learning.

Thomas Kemmerich, Hans Kleine Büning

Published in: MATES (2011)

Keyphrases

reinforcement learning
reinforcement learning algorithms
eligibility traces
function approximation
partially observable environments
global information
state space
signal processing
reward function
temporal difference
model-free
Markov decision processes
dynamic programming
Markov decision process
supervised learning
learning problems
power consumption
optimal control
learning capabilities
policy gradient
optimal policy
reward shaping
genetic algorithm