OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation.
Jongmin Lee
Wonseok Jeon
Byung-Jun Lee
Joelle Pineau
Kee-Eung Kim
Published in:
ICML (2021)
Keyphrases
stationary distribution
Markov chain
state dependent
random walk
queueing networks
queue length
transition probabilities
product form
steady state
initial state
optimal policy
machine learning
queueing model
multistage
maximum likelihood
asymptotically optimal
higher order
neural network