A Provably-Efficient Model-Free Algorithm for Constrained Markov Decision Processes.
Honghao Wei, Xin Liu, Lei Ying
Published in: CoRR (2021)
Keyphrases
- model free
- Markov decision processes
- policy iteration
- reinforcement learning
- average reward
- dynamic programming
- model based reinforcement learning
- NP-hard
- learning algorithm
- reinforcement learning algorithms
- state space
- optimal policy
- optimal solution
- objective function
- search space
- risk sensitive
- state abstraction
- search algorithm
- fixed point
- finite state
- active learning
- average cost
- action space
- policy evaluation
- neural network