Thompson Sampling for Learning Parameterized MDPs

Aditya Gopalan, Shie Mannor

Published in: CoRR (2014)

Keyphrases

reinforcement learning
learning algorithm
learning process
learning problems
stochastic domains
neural network
multi agent systems
active learning
least squares
mobile learning
learning analytics
random sampling
decision theoretic planning