Login / Signup
OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning.
Peter Henderson
Wei-Di Chang
Pierre-Luc Bacon
David Meger
Joelle Pineau
Doina Precup
Published in:
CoRR (2017)
Keyphrases
</>
inverse reinforcement learning
partially observable environments
bayesian nonparametric
preference elicitation
reward function
learning algorithm
reinforcement learning
reinforcement learning algorithms
partially observable
prior knowledge
generative model
unsupervised learning
action selection
learning agent