Iterative Reward Shaping using Human Feedback for Correcting Reward Misspecification.
Jasmina Gajcin
James McCarthy
Rahul Nair
Radu Marinescu
Elizabeth Daly
Ivana Dusparic
Published in:
CoRR (2023)
Keyphrases
reward shaping
reinforcement learning
complex domains
human subjects
Markov decision problems
state space
dynamic programming
linear programming
human experts
long run
reward function
reinforcement learning algorithms