Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation.
Qiang Liu
Lihong Li
Ziyang Tang
Dengyong Zhou
Published in:
CoRR (2018)
Keyphrases
infinite horizon
finite horizon
optimal control
long run
optimal policy
Markov decision processes
production planning
dynamic programming
single item
average cost
stochastic demand
Markov decision process
state space
partially observable
lead time
reinforcement learning
control system
holding cost