Behavior Alignment via Reward Function Optimization.
Dhawal Gupta
Yash Chandak
Scott M. Jordan
Philip S. Thomas
Bruno C. da Silva
Published in:
NeurIPS (2023)
Keyphrases
reward function
inverse reinforcement learning
Markov decision processes
reinforcement learning
reinforcement learning algorithms
state space
autonomous robots
minimax regret