Learning Constrained Markov Decision Processes With Non-stationary Rewards and Constraints.
Francesco Emanuele Stradi, Anna Lunghi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti
Published in: CoRR (2024)
Keyphrases
- markov decision processes
- non-stationary
- reinforcement learning
- partially observable
- state space
- dynamic programming
- stochastic games
- model-based reinforcement learning
- optimal policy
- learning algorithm
- real-time dynamic programming
- decision-theoretic planning
- finite horizon
- reinforcement learning algorithms
- finite state
- state abstraction
- transition matrices
- policy iteration
- average reward
- planning under uncertainty
- random fields
- total reward
- reachability analysis
- function approximation
- learning tasks