Exploiting Similarity Information in Reinforcement Learning - Similarity Models for Multi-Armed Bandits and MDPs.
Ronald Ortner
Published in:
ICAART (1) (2010)
Keyphrases
reinforcement learning
similarity measure
function approximation
probabilistic model
multi-armed bandits
active learning
dynamic programming
maximum likelihood
optimal policy
Euclidean distance
Markov decision processes