Two-Timescale Q-Learning with Function Approximation in Zero-Sum Stochastic Games.

Zaiwei Chen, Kaiqing Zhang, Eric Mazumdar, Asuman E. Ozdaglar, Adam Wierman

Published in: CoRR (2023)

Keyphrases

stochastic games
function approximation
reinforcement learning algorithms
reinforcement learning
model free
temporal difference
Markov decision processes
temporal difference learning
average reward
multiagent reinforcement learning
Nash equilibria
multi agent
function approximators
state action
learning agent
learning automata
TD learning
Nash equilibrium
radial basis function
single agent
repeated games
state space
robust optimization
genetic algorithm
active learning
neural network
policy iteration
dynamic environments
temporal difference methods