Optimistic Policy Optimization with Bandit Feedback.
Yonathan Efroni
Lior Shani
Aviv Rosenberg
Shie Mannor
Published in:
CoRR (2020)
Keyphrases
global optimization
optimization problems
data sets
information retrieval
active learning
relevance feedback
optimization process
lower bound
optimization algorithm
constrained optimization