Gain-based Exploration: From Multi-armed Bandits to Partially Observable Environments.
Bailu Si
J. Michael Herrmann
Klaus Pawelzik
Published in:
ICNC (1) (2007)
Keyphrases
multi-armed bandits
partially observable environments
inverse reinforcement learning
bandit problems
partially observable
reinforcement learning
reinforcement learning algorithms
multi-armed bandit
Bayesian networks
decision making
objective function
partially observable Markov decision processes