Stochastic Primal-Dual Q-Learning Algorithm For Discounted MDPs.
Donghwan Lee, Niao He
Published in: ACC (2019)
Keyphrases
- primal dual
- markov decision processes
- learning algorithm
- linear programming
- reinforcement learning
- reinforcement learning algorithms
- average cost
- linear program
- optimal policy
- affine scaling
- finite horizon
- convex optimization
- linear programming problems
- dynamic programming
- convergence rate
- average reward
- infinite horizon
- interior point methods
- approximation algorithms
- simplex algorithm
- variational inequalities
- interior point algorithm
- finite state
- semidefinite programming
- interior point
- simplex method
- state space
- policy iteration
- markov decision problems
- learning rate
- algorithm for linear programming
- machine learning
- markov decision process
- discounted reward
- learning problems
- supervised learning
- pairwise
- duality gap
- learning tasks
- multiresolution
- saddle point