Multiple constrained continuous-time Markov Decision Processes with expected discounted reward criteria.
Lanlan Zhang, Zhuo Gao
Published in: CAIBDA (2022)
Keyphrases
- markov decision processes
- discounted reward
- average reward
- state space
- policy iteration
- stationary policies
- optimal policy
- reinforcement learning
- state and action spaces
- optimality criterion
- finite state
- dynamic programming
- transition matrices
- planning under uncertainty
- decision theoretic planning
- average cost
- reinforcement learning algorithms
- markov chain
- partially observable
- decision processes
- heuristic search
- state abstraction
- markov decision process
- action space
- reward function
- infinite horizon
- decision makers
- machine learning