Policy Iteration Q-Learning for Data-Based Two-Player Zero-Sum Game of Linear Discrete-Time Systems.
Biao Luo
Yin Yang
Derong Liu
Published in:
IEEE Trans. Cybern. (2021)
Keyphrases
policy iteration
training data
machine learning
reinforcement learning
optimal policy
learning algorithm
multi agent
search space
graphical models
temporal difference