Upper Confidence Primal-Dual Optimization: Stochastically Constrained Markov Decision Processes with Adversarial Losses and Unknown Transitions.
Shuang Qiu, Xiaohan Wei, Zhuoran Yang, Jieping Ye, Zhaoran Wang
Published in: CoRR (2020)
Keyphrases
- markov decision processes
- primal dual
- saddle point
- linear programming
- linear program
- affine scaling
- dynamic programming
- convex optimization
- optimal policy
- finite state
- convergence rate
- policy iteration
- state space
- interior point
- reinforcement learning
- interior point methods
- approximation algorithms
- decision theoretic planning
- transition matrices
- semidefinite programming
- algorithm for linear programming
- partially observable
- average reward
- reactive planning
- penalty function
- reward function
- markov decision process
- average cost
- action sets
- stochastic shortest path
- long run
- np hard
- multi agent