Instance-Dependent Near-Optimal Policy Identification in Linear MDPs via Online Experiment Design.

Andrew Wagenmaker, Kevin Jamieson
Published in: CoRR (2022)
Keyphrases