Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines.
Philip S. Thomas, Emma Brunskill
Published in: CoRR (2017)
Keyphrases
- function approximation
- natural actor critic
- reinforcement learning
- policy gradient methods
- function approximators
- policy gradient
- actor critic
- state action
- temporal difference
- action selection
- temporal difference learning
- reinforcement learning algorithms
- reinforcement learning problems
- action space
- radial basis function
- model free
- reinforcement learning methods
- learning tasks
- approximation methods
- state space
- learning problems
- policy search
- Markov decision processes
- machine learning
- transfer learning