Invariance in Policy Optimisation and Partial Identifiability in Reward Learning.
Joar Skalse, Matthew Farrugia-Roberts, Stuart Russell, Alessandro Abate, Adam Gleave
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- learning algorithm
- learning process
- learning systems
- unsupervised learning
- inverse reinforcement learning
- knowledge acquisition
- active learning
- prior knowledge
- inductive inference
- action selection
- supervised learning
- optimal policy
- learning tasks
- learning problems
- learning analytics
- training data