Walking the Values in Bayesian Inverse Reinforcement Learning.

Ondrej Bajgar, Alessandro Abate, Konstantinos Gatsis, Michael A. Osborne

Published in: CoRR (2024)

Keyphrases

inverse reinforcement learning
Bayesian nonparametric
partially observable environments
preference elicitation
maximum likelihood
attribute values
posterior probability
Bayesian networks
temporal difference
graphical models
decision theory
reward function