DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm.
Yunhao Tang
Tadashi Kozuno
Mark Rowland
Anna Harutyunyan
Rémi Munos
Bernardo Ávila Pires
Michal Valko
Published in:
CoRR (2023)
Keyphrases
multi step
learning algorithm
convergence rate
objective function
search space
optimization algorithm
machine learning
lower bound
semi supervised
text classification
gradient method
approximate dynamic programming
actor critic