Population-Guided Parallel Policy Search for Reinforcement Learning.
Whiyoung Jung, Giseung Park, Youngchul Sung
Published in: ICLR (2020)
Keyphrases
- policy search
- reinforcement learning
- reinforcement learning algorithms
- continuous state
- dynamic programming
- function approximation
- reward function
- temporal difference
- state space
- continuous action
- optimal policy
- partially observable Markov decision processes
- neural network
- policy gradient
- multi-agent
- action selection
- function approximators
- reinforcement learning methods