Adaptive Rollout Length for Model-Based RL Using Model-Free Deep RL.
Abhinav Bhatia
Philip S. Thomas
Shlomo Zilberstein
Published in:
CoRR (2022)
Keyphrases
model free
reinforcement learning
reinforcement learning algorithms
rl algorithms
function approximation
temporal difference
policy iteration
policy evaluation
machine learning
state space
markov decision processes
temporal difference learning