Combing Policy Evaluation and Policy Improvement in a Unified f-Divergence Framework.
Chen Gong
Qiang He
Yunpeng Bai
Xiaoyu Chen
Xinwen Hou
Yu Liu
Guoliang Fan
Published in:
CoRR (2021)
Keyphrases
policy evaluation
least squares
optimal policy
learning algorithm
model free