Provable Reset-free Reinforcement Learning by No-Regret Reduction.

Hoai-An Nguyen, Ching-An Cheng

Published in: ICML (2023)

Keyphrases

reinforcement learning
reward function
online learning
lower bound
total reward
state space
reinforcement learning algorithms
worst case
multi-agent
function approximation
binary classification
model-free
temporal difference
reduction method
confidence bounds
regret minimization
machine learning
partially observable
learning algorithm
regret bounds
expert advice
multi-agent reinforcement learning
multi-armed bandit
Markov decision processes
least squares