Specifying Behavior Preference with Tiered Reward Functions.
Zhiyuan Zhou
Henry Sowerby
Michael L. Littman
Published in:
CoRR (2022)
Keyphrases
Markov decision processes
reward function
state space
Markov decision process
reinforcement learning
dynamic programming
simple examples
user preferences
machine learning
objective function
multi attribute
state transitions