A Configurable Off-Policy Evaluation with Key State-Based Bias Constraints in AI Reinforcement Learning.
Shuoru Wang
Jiqiang Liu
Tong Chen
He Li
Wenjia Niu
Endong Tong
Long Li
Minglu Song
Published in:
SocialSec (2020)
Keyphrases
policy evaluation
reinforcement learning
variance reduction
least squares
state space
temporal difference
function approximation
model free
Markov decision processes
Monte Carlo
policy iteration
learning algorithm
machine learning
optimal policy
semi parametric
state action
cost function