Direct gradient-based reinforcement learning.

Jonathan Baxter, Peter L. Bartlett

Published in: ISCAS (2000)

Keyphrases

reinforcement learning
model free
function approximation
optimal policy
robotic control
reinforcement learning algorithms
multi agent
Markov decision processes
data sets
state space
learning algorithm
supervised learning
dynamic programming
evolutionary algorithm
image sequences
action selection
temporal difference
function approximators
temporal difference learning
autonomous learning
relational reinforcement learning
genetic algorithm