A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs.
Dirk van der Hoeven
Lukas Zierahn
Tal Lancewicki
Aviv Rosenberg
Nicolò Cesa-Bianchi
Published in:
COLT (2023)
Keyphrases
markov decision processes
real time
data mining
image analysis
quantitative analysis
multi-armed bandits
learning algorithm
information systems
data analysis
state space
optimal policy
partially observable
stochastic systems