Inverse Reinforcement Learning with the Average Reward Criterion.
Feiyang Wu, Jingyang Ke, Anqi Wu
Published in: CoRR (2023)
Keyphrases
- average reward
- inverse reinforcement learning
- reward function
- optimality criterion
- markov decision processes
- optimal policy
- reinforcement learning
- state space
- long run
- stochastic games
- reinforcement learning algorithms
- policy iteration
- preference elicitation
- state action
- partially observable
- multiple agents
- markov decision process
- state variables
- temporal difference
- model free
- dynamic programming
- probabilistic model
- generative model
- transition probabilities
- optimal control
- step size
- control system
- decision making