Login / Signup
Regularized Policies are Reward Robust.
Hisham Husain
Kamil Ciosek
Ryota Tomioka
Published in:
CoRR (2021)
Keyphrases
</>
partial occlusion
weighted least squares
half quadratic
search engine
reinforcement learning
least squares
optimal policy
database
data sets
objective function
multi agent
robust estimation