LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning.

Firas Al-Hafez, Davide Tateo, Oleg Arenz, Guoping Zhao, Jan Peters

Published in: ICLR (2023)

Keyphrases

inverse reinforcement learning
Bayesian nonparametric
partially observable environments
reward function
preference elicitation
temporal difference
reinforcement learning algorithms
state space
fuzzy logic
resource allocation
utility function
partial order
multiple agents