Bias-corrected Q-learning to control max-operator bias in Q-learning.

Donghun Lee, Boris Defourny, Warren B. Powell

Published in: ADPRL (2013)

Keyphrases

reinforcement learning
cooperative
function approximation
multi agent
state space
action selection
learning algorithm
control problems
reinforcement learning algorithms
stochastic approximation
model free
genetic algorithm
control system
multi agent reinforcement learning
learning rate
stochastic shortest path
potential field
traffic signal
reinforcement learning methods
variance reduction
decision making
multiscale
learning tasks
Monte Carlo
optimal policy
trade off