Novel single-loop policy iteration for linear zero-sum games.
Jianguo Zhao
Chunyu Yang
Weinan Gao
Ju H. Park
Published in:
Automatica (2024)
Keyphrases
policy iteration
Markov decision processes
linear approximation
model free
fixed point
reinforcement learning
sample path
least squares
Markov decision process
policy evaluation
temporal difference
neural network
optimal solution
optimal policy
infinite horizon
average reward