Bayesian Off-Policy Evaluation and Learning for Large Action Spaces.
Imad Aouali
Victor-Emmanuel Brunel
David Rohde
Anna Korba
Published in:
CoRR (2024)
Keyphrases
reinforcement learning
learning algorithm
prior knowledge
policy evaluation
supervised learning
Markov decision processes
Bayesian networks
search algorithm
domain independent
learning tasks
action selection
temporal difference
statistical inference