Online Reinforcement Learning with Uncertain Episode Lengths.
Debmalya Mandal
Goran Radanovic
Jiarui Gan
Adish Singla
Rupak Majumdar
Published in:
AAAI (2023)
Keyphrases
reinforcement learning
online learning
decision making
function approximation
dynamic programming
state space
optimal policy
Markov decision processes
reinforcement learning algorithms
balancing exploration and exploitation
neural network
information retrieval
email
learning problems