Stochastic Contextual Bandits with Long Horizon Rewards.
Yuzhen Qin
Yingcong Li
Fabio Pasqualetti
Maryam Fazel
Samet Oymak
Published in:
AAAI (2023)
Keyphrases
multi armed bandits
stochastic systems
contextual information
bandit problems
reinforcement learning
context sensitive
stochastic model
stochastic optimization
monte carlo
stochastic models
multi armed bandit
regret bounds
stochastic nature
real world
markov decision processes