Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √T Regret.

Asaf Cassel, Tomer Koren

Published in: CoRR (2021)

Keyphrases

model free
reinforcement learning
policy gradient
learning algorithm
learning process
rl algorithms
learning tasks
reinforcement learning methods
reinforcement learning algorithms
neural network
active learning
dynamic programming
real valued
optimal control