On the Power of Global Reward Signals in Reinforcement Learning.
Thomas Kemmerich, Hans Kleine Büning
Published in: MATES (2011)
Keyphrases
- reinforcement learning
- reinforcement learning algorithms
- eligibility traces
- function approximation
- partially observable environments
- global information
- state space
- signal processing
- reward function
- temporal difference
- model free
- Markov decision processes
- dynamic programming
- Markov decision process
- supervised learning
- learning problems
- power consumption
- optimal control
- learning capabilities
- policy gradient
- optimal policy
- reward shaping
- genetic algorithm