Bias-corrected Q-learning to control max-operator bias in Q-learning.
Donghun Lee, Boris Defourny, Warren B. Powell
Published in: ADPRL (2013)
Keyphrases
- reinforcement learning
- cooperative
- function approximation
- multi agent
- state space
- action selection
- learning algorithm
- control problems
- reinforcement learning algorithms
- stochastic approximation
- model free
- genetic algorithm
- control system
- multi agent reinforcement learning
- learning rate
- stochastic shortest path
- potential field
- traffic signal
- reinforcement learning methods
- variance reduction
- decision making
- multiscale
- learning tasks
- monte carlo
- optimal policy
- trade off