Regularized Off-Policy TD-Learning.
Bo Liu
Sridhar Mahadevan
Ji Liu
Published in:
NIPS (2012)
Keyphrases
td learning
temporal difference
evaluation function
function approximation
reinforcement learning
reinforcement learning algorithms
objective function
least squares
monte carlo
genetic algorithm
model free
multi agent
artificial neural networks
cost function
semi supervised
policy evaluation