Linear Convergence for Natural Policy Gradient with Log-linear Policy Parametrization.
Carlo Alfano, Patrick Rebeschini
Published in: CoRR (2022)
Keyphrases
- policy gradient
- log linear
- actor critic
- model free reinforcement learning
- reinforcement learning
- function approximation
- gradient method
- function approximators
- reinforcement learning algorithms
- log linear models
- approximation methods
- convergence rate
- optimal control
- probabilistic modeling
- single agent
- discriminative training
- state action
- multi agent
- variance reduction
- average reward
- partially observable markov decision processes
- latent variables
- pairwise
- machine learning