Deep PQR: Solving Inverse Reinforcement Learning using Anchor Actions.
Sinong Geng
Houssam Nassif
Carlos A. Manzanares
A. Max Reppen
Ronnie Sircar
Published in:
ICML (2020)
Keyphrases
inverse reinforcement learning
reward function
partially observable environments
Bayesian nonparametric
partially observable
preference elicitation
reinforcement learning
state space
NP-hard
Markov decision processes
situation calculus
probability distribution
linear program
decision theoretic
simple examples