Sample-Efficient Reinforcement Learning via Conservative Model-Based Actor-Critic.
Zhihai Wang
Jie Wang
Qi Zhou
Bin Li
Houqiang Li
Published in:
CoRR (2021)
Keyphrases
reinforcement learning
actor critic
function approximation
model free
temporal difference
policy gradient
reinforcement learning algorithms
policy iteration
optimal control
approximate dynamic programming
machine learning
sample size
neuro fuzzy
decision making
least squares
average reward