Iterative Reward Shaping using Human Feedback for Correcting Reward Misspecification.
Jasmina Gajcin
James McCarthy
Rahul Nair
Radu Marinescu
Elizabeth Daly
Ivana Dusparic
Published in:
CoRR (2023)
Keyphrases
reward shaping
reinforcement learning
complex domains
human subjects
Markov decision problems
state space
dynamic programming
linear programming
human experts
long run
reward function
reinforcement learning algorithms