Iterative Reward Shaping Using Human Feedback for Correcting Reward Misspecification.
Jasmina Gajcin
James McCarthy
Rahul Nair
Radu Marinescu
Elizabeth Daly
Ivana Dusparic
Published in:
ECAI (2023)
Keyphrases
reward shaping
reinforcement learning
complex domains
Monte Carlo
reinforcement learning algorithms
learning algorithm
learning process
probability distribution
state space
human subjects
reward function
Markov decision problems