Proximal Policy Gradient Arborescence for Quality Diversity Reinforcement Learning.
Sumeet Batra, Bryon Tjanaka, Matthew C. Fontaine, Aleksei Petrenko, Stefanos Nikolaidis, Gaurav S. Sukhatme
Published in: CoRR (2023)
Keyphrases
- policy gradient
- reinforcement learning
- actor critic
- policy search
- function approximation
- reinforcement learning algorithms
- state space
- policy gradient methods
- function approximators
- model free reinforcement learning
- optimal policy
- model free
- temporal difference
- gradient method
- Markov decision processes
- optimal control
- variance reduction
- neural network
- average reward
- supervised learning
- approximate dynamic programming
- evolutionary algorithm