Online Inverse Reinforcement Learning via Bellman Gradient Iteration.

Kun Li, Joel W. Burdick

Published in: CoRR (2017)

Keyphrases

inverse reinforcement learning
Bayesian nonparametric
partially observable environments
preference elicitation
multi-agent
theoretical framework
objective function
control system
resource allocation
linear program