Off-policy evaluation for tabular reinforcement learning with synthetic trajectories.

Weiwei Wang, Yuqiang Li, Xianyi Wu
Published in: Stat. Comput. (2024)