Multi-policy posterior sampling for restless Markov bandits.
Suleman Alnatheer
Hong Man
Published in:
GlobalSIP (2014)
Keyphrases
semi-Markov
Markov chain Monte Carlo
Markov chain
optimal policy
optimal control
probabilistic model
probability distribution
sample size
posterior probability
posterior distribution
random sampling
sampling algorithm
conditional random fields
Markov model
infinite horizon
multi-armed bandit problems