Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks.

Andrew Starnes, Anton Dereventsov, Clayton G. Webster
Published in: CoRR (2023)
Keyphrases
  • policy gradient
  • search space
  • actor critic
  • neural network
  • optimization method