Inverse Reinforcement Learning in Contextual MDPs.
Philip Korsunsky, Stav Belogolovsky, Tom Zahavy, Chen Tessler, Shie Mannor
Published in: CoRR (2019)
Keyphrases
- inverse reinforcement learning
- reward function
- markov decision processes
- bayesian nonparametric
- partially observable environments
- reinforcement learning
- state space
- partially observable
- preference elicitation
- reinforcement learning algorithms
- multiple agents
- optimal policy
- markov decision process
- markov decision problems
- transition probabilities
- machine learning
- optimal solution
- multi agent
- temporal difference
- average cost
- learning agent
- finite state
- planning under uncertainty
- search algorithm