Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization.
Sreejith Balakrishnan, Quoc Phong Nguyen, Bryan Kian Hsiang Low, Harold Soh
Published in: CoRR (2020)
Keyphrases
- inverse reinforcement learning
- reward function
- Bayesian nonparametric
- state space
- preference elicitation
- Markov decision processes
- reinforcement learning algorithms
- reinforcement learning
- partially observable
- simple examples
- optimal policy
- multiple agents
- transition probabilities
- policy search
- temporal difference
- decision problems
- machine learning
- state variables
- evaluation function
- control policies
- generative model
- maximum likelihood
- decision making