Invariance in Policy Optimisation and Partial Identifiability in Reward Learning.
Joar Max Viktor Skalse
Matthew Farrugia-Roberts
Stuart Russell
Alessandro Abate
Adam Gleave
Published in:
ICML (2023)
Keyphrases
reinforcement learning
learning process
active learning
partially observable environments
learning systems
learning tasks
inverse reinforcement learning
optimal policy
learning scenarios
learning algorithm
state space
unsupervised learning
action selection