Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies.
Ilyas Fatkhullin, Anas Barakat, Anastasya Kireeva, Niao He
Published in: CoRR (2023)
Keyphrases
- sample complexity
- policy gradient methods
- natural actor critic
- theoretical analysis
- learning algorithm
- active learning
- special case
- policy gradient
- learning problems
- robot arm
- upper bound
- lower bound
- generalization error
- training examples
- sample size
- supervised learning
- Monte Carlo
- learning tasks
- optimal policy
- support vector
- semi-supervised
- sufficient conditions
- state space
- dynamic programming
- NP-hard
- reinforcement learning