An Improved Minimax-Q Algorithm Based on Generalized Policy Iteration to Solve a Chaser-Invader Game.

Minsong Liu, Yuanheng Zhu, Dongbin Zhao

Published in: IJCNN (2020)

Keyphrases

policy iteration
worst case
NP-hard
objective function
learning algorithm
stochastic approximation
optimal solution
least squares
mathematical model
game tree
reinforcement learning
multi agent
search space
probabilistic model
model free
temporal difference