Hindsight Balanced Reward Shaping.
Mengxuan Shao
Feng Jiang
Shaohui Liu
Kun Han
Debin Zhao
Published in:
ICONIP (5) (2022)
Keyphrases
reward shaping
reinforcement learning
complex domains
reinforcement learning algorithms
state space
function approximation
markov decision problems
machine learning
training data
optimal policy
action selection