SIBRE: Self Improvement Based REwards for Reinforcement Learning.

Somjit Nath, Richa Verma, Abhik Ray, Harshad Khadilkar

Published in: CoRR (2020)

Keyphrases

reinforcement learning
markov decision processes
state space
reinforcement learning algorithms
temporal difference
function approximation
reward function
optimal policy
reward shaping
model free
neural network
machine learning
sufficient conditions
robotic control
temporal difference learning
learning algorithm
optimal control
learning problems
significant improvement
transfer learning
least squares
evolutionary algorithm
learning process
action selection
multi agent
function approximators
control policy
reinforcement learning methods
database
supervised learning