Correlation minimizing replay memory in temporal-difference reinforcement learning.

Mirza Ramicic, Andrea Bonarini

Published in: Neurocomputing (2020)

Keyphrases

temporal difference
reinforcement learning
function approximation
TD learning
past experience
model-free
temporal difference learning
evaluation function
reinforcement learning algorithms
function approximators
action selection
state space
policy evaluation
Monte Carlo
step size
policy iteration
temporal difference methods
supervised learning
actor-critic
objective function
learning algorithm
multi-agent
genetic algorithm
TD methods
machine learning