Deep PQR: Solving Inverse Reinforcement Learning using Anchor Actions

Sinong Geng, Houssam Nassif, Carlos A. Manzanares, A. Max Reppen, Ronnie Sircar

Published in: ICML (2020)

Keyphrases

inverse reinforcement learning
reward function
partially observable environments
Bayesian nonparametric
partially observable
preference elicitation
reinforcement learning
state space
NP-hard
Markov decision processes
situation calculus
probability distribution
linear program
decision theoretic
simple examples