IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies.

Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine
Published in: CoRR (2023)
Keyphrases
  • learning algorithm
  • dynamic programming
  • gradient method
  • reinforcement learning
  • support vector machine
  • mathematical model
  • Monte Carlo
  • convergence rate
  • function approximation
  • optimal control