Inverse Reinforcement Learning with the Average Reward Criterion.
Feiyang Wu, Jingyang Ke, Anqi Wu
Published in: NeurIPS (2023)
Keyphrases
- average reward
- inverse reinforcement learning
- reward function
- optimality criterion
- Markov decision processes
- optimal policy
- reinforcement learning
- state space
- state action
- reinforcement learning algorithms
- stochastic games
- long run
- model free
- temporal difference
- partially observable
- policy iteration
- preference elicitation
- Markov decision process
- multiple agents
- transition probabilities
- finite state
- decision problems
- generative model
- Markov chain
- state variables
- partially observable Markov decision processes
- function approximation
- control system