ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization.
Tianying Ji
Yongyuan Liang
Yan Zeng
Yu Luo
Guowei Xu
Jiawei Guo
Ruijie Zheng
Furong Huang
Fuchun Sun
Huazhe Xu
Published in:
CoRR (2024)
Keyphrases
actor critic
convergence proof
reinforcement learning
approximate dynamic programming
optimal control
neuro fuzzy
temporal difference
policy gradient
gradient method
reinforcement learning algorithms
policy iteration
function approximation
markov decision processes
state space
neural network
model free