Beyond discounted returns: Robust Markov decision processes with average and Blackwell optimality.
Julien Grand-Clément, Marek Petrik, Nicolas Vieille
Published in: CoRR (2023)
Keyphrases
- Markov decision processes
- average cost
- average reward
- optimal policy
- finite state
- state space
- stationary policies
- finite horizon
- infinite horizon
- transition matrices
- dynamic programming
- discounted reward
- policy iteration
- reinforcement learning
- long run
- Markov decision process
- planning under uncertainty
- optimal control
- decision processes
- reinforcement learning algorithms
- action space
- decision theoretic planning
- finite number
- linear program
- state and action spaces
- action sets
- model based reinforcement learning
- real time dynamic programming
- risk sensitive
- partially observable
- total cost
- multistage
- real valued
- semi-Markov decision processes
- linear programming
- optimal solution