Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation.
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
Published in:
ICLR (2020)
Keyphrases
infinite horizon
finite horizon
optimal control
long run
stochastic demand
dynamic programming
optimal policy
Markov decision processes
partially observable
production planning
state space
average cost
lead time
fixed cost
machine learning
inventory policy