A novel double-mGBDT-based Q-learning.

Qiming Fu, Shuai Ma, Dawei Tian, Jianping Chen, Zhen Gao, Shan Zhong

Published in: Int. J. Model. Identif. Control. (2021)

Keyphrases

reinforcement learning
cooperative
multi-agent
function approximation
state space
learning algorithm
optimal policy
reinforcement learning algorithms
action selection
stochastic approximation
learning rate
model-free
multi-agent reinforcement learning
potential field
policy iteration
real time
temporal difference learning
continuous state and action spaces