A Cross Entropy based Stochastic Approximation Algorithm for Reinforcement Learning with Linear Function Approximation.

Ajin George Joseph, Shalabh Bhatnagar

Published in: CoRR (2016)

Keyphrases

stochastic approximation
function approximation
reinforcement learning
function approximators
model free
temporal difference learning
dynamic programming
learning algorithm
temporal difference learning algorithms
monte carlo
policy iteration
linear programming
state space
neural network
actor critic
policy evaluation
objective function
optimal solution
single agent
reinforcement learning algorithms
temporal difference
search space
multi agent
radial basis function
support vector machine (svm)
td learning
least squares