Inverse Reinforcement Learning without Reinforcement Learning.
Gokul Swamy, David Wu, Sanjiban Choudhury, Drew Bagnell, Zhiwei Steven Wu
Published in: ICML (2023)
Keyphrases
- inverse reinforcement learning
- partially observable environments
- reinforcement learning
- reward function
- temporal difference
- bayesian nonparametric
- reinforcement learning algorithms
- preference elicitation
- markov decision processes
- function approximation
- model free
- partially observable
- state space
- optimal policy
- markov decision process
- multiple agents
- action selection
- supervised learning
- decision making
- artificial intelligence
- machine learning
- long run
- decision theory
- utility function
- partially observable markov decision processes
- control policies
- partial observability