Mitigating Off-Policy Bias in Actor-Critic Methods with One-Step Q-learning: A Novel Correction Approach.
Baturay Saglam
Dogan Can Çiçek
Furkan Burak Mutlu
Suleyman S. Kozat
Published in:
Trans. Mach. Learn. Res. (2024)
Keyphrases
reinforcement learning
multi-agent
neural network
cost function
function approximation