Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √T Regret

Asaf Cassel, Tomer Koren

Published in: ICML (2021)

Keyphrases

reinforcement learning
model free
policy gradient
reinforcement learning algorithms
learning process
function approximation
supervised learning
learning problems
learning tasks
learning algorithm
markov decision processes
average reward
reinforcement learning methods
rl algorithms
dynamical systems
policy iteration
stochastic games
variance reduction