Inverse Reinforcement Learning in Contextual MDPs.

Philip Korsunsky, Stav Belogolovsky, Tom Zahavy, Chen Tessler, Shie Mannor

Published in: CoRR (2019)

Keyphrases

inverse reinforcement learning
reward function
Markov decision processes
Bayesian nonparametric
partially observable environments
reinforcement learning
state space
partially observable
preference elicitation
reinforcement learning algorithms
multiple agents
optimal policy
Markov decision process
Markov decision problems
transition probabilities
machine learning
optimal solution
multi-agent
temporal difference
average cost
learning agent
finite state
planning under uncertainty
search algorithm