Robust Reward Design for Markov Decision Processes.
Shuo Wu, Haoxiang Ma, Jie Fu, Shuo Han
Published in: CoRR (2024)
Keyphrases
- markov decision processes
- reinforcement learning
- optimal policy
- average reward
- state space
- reward function
- dynamic programming
- finite state
- decision theoretic planning
- policy iteration
- transition matrices
- markov decision process
- average cost
- long run
- action space
- partially observable
- action sets
- decision processes
- planning under uncertainty
- risk sensitive
- discounted reward
- sufficient conditions