Exploration-Guided Reward Shaping for Reinforcement Learning under Sparse Rewards.

Rati Devidze, Parameswaran Kamalaruban, Adish Singla

Published in: NeurIPS (2022)

Keyphrases

reward shaping
reinforcement learning
reinforcement learning algorithms
complex domains
action selection
state space
function approximation
Markov decision problems
multi-agent
model free
dynamic programming
machine learning
optimal control
learning algorithm
neural network
temporal difference
objective function
learning agent
function approximators
continuous state