Similarities between policy gradient methods (PGM) in Reinforcement learning (RL) and supervised learning (SL).
Eric Benhamou
Published in: CoRR (2019)
Keyphrases
- reinforcement learning
- policy gradient methods
- supervised learning
- actor critic
- natural actor critic
- policy gradient
- function approximation
- reinforcement learning algorithms
- temporal difference
- reinforcement learning methods
- function approximators
- reinforcement learning problems
- rl algorithms
- model free
- learning algorithm
- state space
- learning problems
- machine learning
- approximate dynamic programming
- dynamic programming
- control problems
- action selection
- optimal policy
- learning process
- learning capabilities
- labeled data
- learning tasks
- markov decision processes
- transfer learning
- temporal difference learning
- partially observable
- action space
- robot arm
- average reward
- markov decision problems
- multi agent systems
- machine learning algorithms