The Stochastic Evolutionary Dynamics of Softmax Policy Gradient in Games.

Chin-wing Leung, Shuyue Hu, Ho-fung Leung

Published in: AAMAS (2024)

Keyphrases

policy gradient
model free reinforcement learning
actor critic
reinforcement learning
game theory
gradient method
Monte Carlo
optimal control
function approximation
Nash equilibria
reinforcement learning algorithms
learning automata
average reward
Nash equilibrium
variance reduction
search algorithm
game theoretic
single agent
game playing
approximation methods
reinforcement learning methods
optimal policy
dynamic programming