Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation.
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
Published in:
CoRR (2019)
Keyphrases
infinite horizon
finite horizon
optimal policy
optimal control
long run
dynamic programming
markov decision processes
stochastic demand
production planning
single item
markov decision process
state space
partially observable
average cost
fixed cost
dec pomdps