Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks.

Andrew Starnes, Anton Dereventsov, Clayton G. Webster

Published in: CoRR (2023)

Keyphrases

policy gradient
search space
actor critic
neural network
optimization method