Synthesis of Discounted-Reward Optimal Policies for Markov Decision Processes Under Linear Temporal Logic Specifications.
Krishna Chaitanya Kalagarla, Rahul Jain, Pierluigi Nuzzo
Published in: CoRR (2020)
Keyphrases
- markov decision processes
- discounted reward
- optimal policy
- average reward
- policy iteration
- finite horizon
- decision problems
- finite state
- reinforcement learning
- state space
- dynamic programming
- state and action spaces
- infinite horizon
- average cost
- planning under uncertainty
- temporal logic
- reinforcement learning algorithms
- semi markov decision processes
- decision processes
- long run
- multistage
- markov decision process
- partially observable
- optimality criterion
- action space
- function approximation
- sufficient conditions
- initial state
- partially observable markov decision processes
- model checking
- total reward
- markov chain