High-Dimensional Contextual Policy Search with Unknown Context Rewards using Bayesian Optimization.
Qing Feng
Benjamin Letham
Hongzi Mao
Eytan Bakshy
Published in:
NeurIPS (2020)
Keyphrases
policy search
high dimensional
reinforcement learning
Bayesian networks
reward function
optimal solution
feature space
state space
optimal policy
function approximation
decision theory
continuous state
Monte Carlo methods