Adaptive Rollout Length for Model-Based RL Using Model-Free Deep RL.

Abhinav Bhatia, Philip S. Thomas, Shlomo Zilberstein

Published in: CoRR (2022)

Keyphrases

model free
reinforcement learning
reinforcement learning algorithms
rl algorithms
function approximation
temporal difference
policy iteration
policy evaluation
machine learning
state space
markov decision processes
temporal difference learning