Inverse reinforcement learning in contextual MDPs.
Stav Belogolovsky, Philip Korsunsky, Shie Mannor, Chen Tessler, Tom Zahavy
Published in: Mach. Learn. (2021)
Keyphrases
- inverse reinforcement learning
- reward function
- Markov decision processes
- Bayesian nonparametric
- partially observable environments
- reinforcement learning
- state space
- optimal policy
- partially observable
- Markov decision process
- reinforcement learning algorithms
- preference elicitation
- multiple agents
- simple examples
- Markov decision problems
- finite horizon
- transition probabilities
- approximate dynamic programming
- average cost
- policy iteration
- machine learning
- search algorithm
- control policies
- generative model
- infinite horizon
- finite state
- state variables
- mixture model