PP-PG: Combining Parameter Perturbation with Policy Gradient Methods for Effective and Efficient Explorations in Deep Reinforcement Learning.
Shilei Li, Meng Li, Jiongming Su, Shaofei Chen, Zhimin Yuan, Qing Ye
Published in: ACM Trans. Intell. Syst. Technol. (2021)