Online Stochastic Shortest Path with Bandit Feedback and Unknown Transition Function.
Aviv Rosenberg
Yishay Mansour
Published in:
NeurIPS (2019)
Keyphrases
stochastic shortest path
online learning
neural network
decision making
state space
Markov chain