Policy Iteration Q-Learning for Data-Based Two-Player Zero-Sum Game of Linear Discrete-Time Systems.

Biao Luo, Yin Yang, Derong Liu
Published in: IEEE Trans. Cybern. (2021)
Keyphrases
  • policy iteration
  • training data
  • machine learning
  • reinforcement learning
  • optimal policy
  • learning algorithm
  • multi agent
  • search space
  • graphical models
  • temporal difference