Successive Over-Relaxation Q-Learning.

Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar

Published in: IEEE Control Syst. Lett. (2020)

Keyphrases

reinforcement learning
function approximation
cooperative
multi agent
stochastic approximation
state space
learning algorithm
action selection
reinforcement learning algorithms
optimal policy
multi agent reinforcement learning
probabilistic relaxation
lognormal distribution
bucket brigade
iterative algorithms
temporal difference learning
potential field
convex relaxation
policy iteration
state action
fixed point
reward function
linear programming relaxation
learning rate
Markov decision processes
artificial neural networks
objective function