Fisher-Rao Gradient Flows of Linear Programs and State-Action Natural Policy Gradients.
Johannes Müller, Semih Cayci, Guido Montúfar
Published in: CoRR (2024)
Keyphrases
- linear program
- state action
- evaluation function
- linear programming
- reinforcement learning
- Markov decision process
- average reward
- action space
- stochastic games
- objective function
- state transitions
- optimal solution
- Markov decision processes
- average cost
- optimal policy
- reward function
- dynamic programming
- belief state
- function approximators
- action selection
- state space
- kernel matrix
- policy iteration