Rate-Optimal Policy Optimization for Linear Markov Decision Processes.
Uri Sherman, Alon Cohen, Tomer Koren, Yishay Mansour
Published in: CoRR (2023)
Keyphrases
- optimal policy
- markov decision processes
- state space
- reinforcement learning
- decision problems
- finite state
- finite horizon
- dynamic programming
- infinite horizon
- long run
- average reward
- policy iteration
- sufficient conditions
- average cost
- markov decision process
- state dependent
- decision processes
- initial state
- multistage
- partially observable markov decision processes
- reinforcement learning algorithms
- planning under uncertainty
- partially observable
- action space
- control policies
- machine learning
- objective function