Rates of Convergence of Performance Gradient Estimates Using Function Approximation and Bias in Reinforcement Learning.
Gregory Z. Grudic, Lyle H. Ungar
Published in: NIPS (2001)
Keyphrases
- function approximation
- rates of convergence
- reinforcement learning
- policy gradient
- temporal difference
- temporal difference learning
- regression function
- model-free
- function approximators
- learning tasks
- state space
- decision boundary
- radial basis function
- reinforcement learning algorithms
- learning algorithm
- expectation maximization
- Markov decision processes
- gradient method
- learning experience
- supervised learning
- semi-supervised
- data sets