Multi-Armed Bandit Strategies for Non-Stationary Reward Distributions and Delayed Feedback Processes.
Larkin Liu
Richard Downe
Joshua Reid
Published in:
CoRR (2019)
Keyphrases
non stationary
multi armed bandit
delayed feedback
reinforcement learning
multi armed bandits
decentralized decision making
multi agent
adaptive algorithms
probability distribution
mutual information
empirical mode decomposition