Distributed Asynchronous Policy Iteration for Sequential Zero-Sum Games and Minimax Control.
Dimitri P. Bertsekas
Published in: CoRR (2021)
Keyphrases
- policy iteration
- Markov decision processes
- optimal control
- model free
- fixed point
- control system
- optimal policy
- least squares
- sample path
- artificial neural networks
- multi agent
- imperfect information
- decision making
- Markov decision process
- policy evaluation
- temporal difference
- control strategy
- infinite horizon
- optimal strategy
- finite state
- neural network
- evaluation function
- convergence rate
- learning algorithm