Adversarially Trained Actor Critic for offline CMDPs.
Honghao Wei
Xiyue Peng
Xin Liu
Arnob Ghosh
Published in:
CoRR (2024)
Keyphrases
actor critic
reinforcement learning
optimal control
policy gradient
temporal difference
approximate dynamic programming
neuro fuzzy
gradient method
function approximation
reinforcement learning algorithms
training set
policy iteration
fuzzy sets
fixed point