Toward Optimal Policy Population Growth in Two-Player Zero-Sum Games.
Stephen Marcus McAleer, JB Lanier, Kevin A. Wang, Pierre Baldi, Tuomas Sandholm, Roy Fox
Published in: ICLR (2024)
Keyphrases
- optimal policy
- optimal strategy
- decision problems
- perfect information
- imperfect information
- markov decision processes
- dynamic programming
- reinforcement learning
- finite horizon
- state space
- reinforcement learning algorithms
- long run
- multistage
- game theoretic
- infinite horizon
- state dependent
- markov decision process
- bayesian reinforcement learning
- reward function
- sufficient conditions
- control policies
- single agent
- policy iteration
- expected cost
- average cost
- learning algorithm
- game playing
- monte carlo
- inventory control