Accelerated and instance-optimal policy evaluation with linear function approximation.

Tianjiao Li, Guanghui Lan, Ashwin Pananjady

Published in: CoRR (2021)

Keyphrases

function approximation
policy evaluation
temporal difference
reinforcement learning
function approximators
model-free
TD learning
least squares
radial basis function
learning tasks
semi-parametric
Monte Carlo
dynamic programming
Markov decision processes
optimal control
state space
policy iteration
artificial neural networks
optimal solution
learning algorithm
data mining
Markov chain
step size