Efficient Planning in Large MDPs with Weak Linear Function Approximation.

Roshan Shariff, Csaba Szepesvári

Published in: CoRR (2020)

Keyphrases

function approximation
reinforcement learning
temporal difference learning algorithms
reinforcement learning problems
function approximators
Markov decision processes
temporal difference learning
learning tasks
temporal difference
state space
neural network
reinforcement learning algorithms
dynamic programming
radial basis function
Markov decision problems
temporal difference methods
model free
action selection
policy gradient
initial state
text classification
genetic algorithm