Addressing Sample Inefficiency and Reward Bias in Inverse Reinforcement Learning.
Ilya Kostrikov
Kumar Krishna Agrawal
Sergey Levine
Jonathan Tompson
Published in:
CoRR (2018)
Keyphrases
inverse reinforcement learning
bayesian nonparametric
partially observable environments
reward function
preference elicitation
optimal policy
sample size
bayesian networks
reinforcement learning
markov decision processes
temporal difference
reinforcement learning algorithms
partially observable