Behavior Alignment via Reward Function Optimization.
Dhawal Gupta
Yash Chandak
Scott M. Jordan
Philip S. Thomas
Bruno Castro da Silva
Published in:
CoRR (2023)
Keyphrases
reward function
inverse reinforcement learning
state space
Markov decision processes
reinforcement learning algorithms
reinforcement learning
machine learning
prior knowledge
optimal policy
state variables
complex systems