Logit-Q Learning in Markov Games.

Muhammed O. Sayin, Onur Unlu

Published in: CoRR (2022)

Keyphrases

markov games
markov decision processes
reinforcement learning algorithms
multiagent reinforcement learning
reinforcement learning
markov decision process
control problems
state space
multiagent systems
cooperative
optimal policy
model free
multi agent
nash equilibrium
optimal stopping
temporal difference
policy iteration
finite state
stochastic games
function approximation
average cost
temporal difference learning
learning algorithm
action space
partially observable
reward function
infinite horizon