Logarithmic Regret for Reinforcement Learning with Linear Function Approximation.

Jiafan He, Dongruo Zhou, Quanquan Gu

Published in: CoRR (2020)

Keyphrases

function approximation
reinforcement learning
function approximators
temporal difference learning algorithms
regret bounds
temporal difference learning
mountain car
temporal difference
state action space
tile coding
radial basis function
learning tasks
model free
reinforcement learning algorithms
learning algorithm
machine learning
state space
temporal difference methods
neural network
TD learning
feature selection
reward function
learning problems
Markov decision processes
optimal policy
learning process
multi agent