An online primal-dual method for discounted Markov decision processes.
Mengdi Wang, Yichen Chen
Published in: CDC (2016)
Keyphrases
- Markov decision processes
- dynamic programming
- optimal policy
- reinforcement learning
- primal-dual
- finite state
- objective function
- linear programming
- infinite horizon
- Markov decision process
- state space
- NP-hard
- dynamical systems
- computational complexity
- sensitivity analysis
- convergence rate
- machine learning
- policy iteration
- simplex algorithm
- affine scaling