Addressing Optimism Bias in Sequence Modeling for Reinforcement Learning.
Adam R. Villaflor
Zhe Huang
Swapnil Pande
John M. Dolan
Jeff Schneider
Published in:
ICML (2022)
Keyphrases
reinforcement learning
function approximation
real time
information systems
active learning
dynamic programming
optimal policy
optimal control
modeling language
modeling method
reinforcement learning algorithms
temporal difference learning