Online learning of shaping rewards in reinforcement learning.
Marek Grzes, Daniel Kudenko
Published in: Neural Networks (2010)
Keyphrases
- online learning
- reinforcement learning
- reward shaping
- reinforcement learning algorithms
- Markov decision processes
- complex domains
- function approximation
- state space
- e learning
- computer mediated
- higher education
- distance education
- reward function
- model free
- online course
- blended learning
- Markov decision problems
- distance learning
- optimal policy
- machine learning
- online algorithms
- dynamic programming
- learning algorithm
- data sets
- action selection
- temporal difference
- active learning
- function approximators
- hidden state
- online learning environments
- transition model
- neural network