Iterative Reward Shaping Using Human Feedback for Correcting Reward Misspecification.
Jasmina Gajcin
James McCarthy
Rahul Nair
Radu Marinescu
Elizabeth Daly
Ivana Dusparic
Published in:
ECAI (2023)
Keyphrases
reward shaping
reinforcement learning
complex domains
Monte Carlo
reinforcement learning algorithms
learning algorithm
learning process
probability distribution
state space
human subjects
reward function
Markov decision problems