Avoiding Wireheading with Value Reinforcement Learning.
Tom Everitt, Marcus Hutter
Published in: AGI (2016)
Keyphrases
- reinforcement learning
- function approximation
- learning algorithm
- state space
- optimal policy
- reinforcement learning algorithms
- control problems
- least squares
- machine learning
- transfer learning
- temporal difference
- model free
- Markov decision processes
- multi agent
- direct policy search
- transition model
- temporal difference learning
- evaluation function
- control policy
- relational reinforcement learning
- robot control
- learning capabilities
- action selection
- data sets
- learning problems
- learning process
- databases