Policy Iteration Q-Learning for Data-Based Two-Player Zero-Sum Game of Linear Discrete-Time Systems.

Biao Luo, Yin Yang, Derong Liu
Published in: IEEE Trans. Cybern. (2021)
Keyphrases
  • policy iteration
  • training data
  • machine learning
  • reinforcement learning
  • optimal policy
  • learning algorithm
  • multi agent
  • search space
  • graphical models
  • temporal difference