Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling.
Yunfan Li
Yiran Wang
Yu Cheng
Lin Yang
Published in:
ICML (2023)
Keyphrases
neural network
policy gradient
parametric optimization
reinforcement learning
sample size
action selection
evolutionary algorithm
dynamic programming
upper bound
supervised learning
domain independent