Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation.

Fengdi Che, Chenjun Xiao, Jincheng Mei, Bo Dai, Ramki Gummadi, Oscar A Ramirez, Christopher K. Harris, A. Rupam Mahmood, Dale Schuurmans
Published in: CoRR (2024)
Keyphrases