Reinforcement Learning: a Comparison of UCB Versus Alternative Adaptive Policies.
Wesley Cowan
Michael N. Katehakis
Daniel Pirutinsky
Published in:
CoRR (2019)
Keyphrases
reinforcement learning
optimal policy
policy search
learning algorithm
control policies
Markov decision processes
Markov decision process
machine learning
optimal control
reward function
reinforcement learning algorithms
multi-armed bandit
policy gradient methods