Online Policy Optimization for Robust MDP.
Jing Dong, Jingwei Li, Baoxiang Wang, Jingzhao Zhang
Published in: CoRR (2022)
Keyphrases
- optimal policy
- markov decision process
- markov decision processes
- state space
- online learning
- global optimization
- markov decision problems
- robust optimization
- optimization process
- learning algorithm
- reinforcement learning
- partially observable
- simultaneous optimization
- real time
- constrained optimization
- optimization method
- linear program
- optimization algorithm
- linear programming