Secure Cumulative Reward Maximization in Linear Stochastic Bandits.
Radu Ciucanu
Anatole Delabrouille
Pascal Lafourcade
Marta Soare
Published in:
ProvSec (2020)
Keyphrases
multi-armed bandit
regret bounds
stochastic systems
neural network
reinforcement learning
linear systems
stochastic optimization
security issues
objective function
long run
security mechanisms
multi-armed bandits