APR-ES: Adaptive Penalty-Reward Based Evolution Strategy for Deep Reinforcement Learning.
Dongdong Wang, Siyang Lu, Xiang Wei, Mingquan Wang, Yandong Li, Liqiang Wang
Published in: SmartWorld/UIC/ScalCom/DigitalTwin/PriComp/Meta (2022)
Keyphrases
- reinforcement learning
- evolution strategy
- evolutionary algorithm
- function approximation
- differential evolution
- state space
- cma-es
- optimal policy
- particle swarm optimization algorithm
- eligibility traces
- reinforcement learning algorithms
- reward function
- model free
- machine learning
- multi agent
- learning process
- learning algorithm
- temporal difference
- markov decision processes
- partially observable environments
- reward shaping
- action selection
- optimal control
- multi objective
- objective function
- learning capabilities
- markov decision process
- policy iteration
- candidate solutions
- policy gradient
- global search
- neural network
- optimization methods