Faster saddle-point optimization for solving large-scale Markov decision processes.
Joan Bas-Serrano, Gergely Neu
Published in: L4DC (2020)
Keyphrases
- markov decision processes
- saddle point
- transition matrices
- variational inequalities
- state space
- penalty function
- primal dual
- dynamic programming
- reinforcement learning
- markov decision problems
- decision theoretic planning
- optimal policy
- policy iteration
- maximum margin
- stochastic shortest path
- numerical methods
- partially observable
- average reward
- linear programming problems
- interior point
- discrete space
- quadratic programming
- markov networks
- semidefinite programming
- global constraints
- sensitivity analysis
- structured prediction
- dirichlet distribution
- genetic algorithm