Exploring Best Arm with Top Reward-Cost Ratio in Stochastic Bandits.
Zhida Qin
Xiaoying Gan
Jia Liu
Hongqiu Wu
Haiming Jin
Luoyi Fu
Published in:
INFOCOM (2020)
Keyphrases
multi-armed bandit
stochastic systems
multi-armed bandit problems
reinforcement learning
Monte Carlo
stochastic optimization
total cost
failure rate
neural network
high cost
cost sensitive
stochastic models
bandit problems
cost reduction
Markov decision processes
flow network
query processing
machine learning