Balancing Two-Player Stochastic Games with Soft Q-Learning.
Jordi Grau-Moya, Felix Leibfried, Haitham Bou-Ammar
Published in: CoRR (2018)
Keyphrases
- stochastic games
- reinforcement learning algorithms
- multi agent reinforcement learning
- state action
- single agent
- reinforcement learning
- multi agent
- Nash equilibria
- Markov decision processes
- state space
- model free
- repeated games
- learning agent
- multiagent reinforcement learning
- rl algorithms
- function approximation
- Nash equilibrium
- temporal difference
- learning algorithm
- imperfect information
- multiple agents
- optimal policy
- learning automata
- reward function
- cooperative
- average reward
- dynamic environments
- learning agents
- decision problems
- policy iteration
- Markov decision process
- evaluation function
- NP-hard
- multi agent systems
- finite state