Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies.
Ilyas Fatkhullin, Anas Barakat, Anastasia Kireeva, Niao He
Published in: ICML (2023)
Keyphrases
- sample complexity
- policy gradient methods
- natural actor critic
- theoretical analysis
- learning problems
- upper bound
- lower bound
- policy gradient
- active learning
- generalization error
- robot arm
- learning algorithm
- supervised learning
- special case
- sample size
- training examples
- optimal policy
- function approximation
- machine learning algorithms
- Monte Carlo
- actor critic