Logarithmic Regret for Reinforcement Learning with Linear Function Approximation.

Jiafan He, Dongruo Zhou, Quanquan Gu

Published in: ICML (2021)

Keyphrases

function approximation
reinforcement learning
function approximators
temporal difference learning algorithms
regret bounds
temporal difference
temporal difference learning
model-free
mountain car
learning tasks
radial basis function
tile coding
state space
reinforcement learning algorithms
learning algorithm
state action space
reward function
multi-agent
TD learning
optimal control
data mining
supervised learning
neural network
learning problems
pattern recognition
actor-critic