Decentralized model-free reinforcement learning in stochastic games with average-reward objective.
Romain Cravic
Nicolas Gast
Bruno Gaujal
Published in:
CoRR (2023)
Keyphrases
stochastic games
average reward
policy gradient
Markov decision processes
long run
optimal policy
multi agent
policy iteration
reinforcement learning
Nash equilibria
Markov chain
state action
model free
repeated games
robust optimization
cooperative
finite state
decision makers
average cost
search algorithm