Reinforcement Learning and Inverse Reinforcement Learning with System 1 and System 2.
Alexander Peysakhovich
Published in: AIES (2019)
Keyphrases
- inverse reinforcement learning
- partially observable environments
- reinforcement learning
- reward function
- temporal difference
- Bayesian nonparametric
- reinforcement learning algorithms
- preference elicitation
- state space
- Markov decision processes
- function approximation
- Markov decision process
- model free
- partially observable
- optimal policy
- approximate dynamic programming
- action selection
- learning algorithm