SIBRE: Self Improvement Based REwards for Reinforcement Learning.
Somjit Nath, Richa Verma, Abhik Ray, Harshad Khadilkar
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- markov decision processes
- state space
- reinforcement learning algorithms
- temporal difference
- function approximation
- reward function
- optimal policy
- reward shaping
- model free
- neural network
- machine learning
- sufficient conditions
- robotic control
- temporal difference learning
- learning algorithm
- optimal control
- learning problems
- significant improvement
- transfer learning
- least squares
- evolutionary algorithm
- learning process
- action selection
- multi agent
- function approximators
- control policy
- reinforcement learning methods
- database
- supervised learning