Q-Learning for Feedback Nash Strategy of Finite-Horizon Nonzero-Sum Difference Games.
Zhaorong Zhang, Juanjuan Xu, Minyue Fu
Published in: IEEE Trans. Cybern. (2022)
Keyphrases
- finite horizon
- optimal policy
- infinite horizon
- nash equilibria
- nash equilibrium
- optimal stopping
- markov decision processes
- decision problems
- game theory
- game theoretic
- stochastic games
- reinforcement learning
- optimal strategy
- state space
- inventory control
- mixed strategy
- cooperative game
- single product
- dynamic programming
- inventory models
- cooperative
- markov decision process
- multistage
- multi agent
- long run
- lot size
- finite state
- average cost
- objective function
- equilibrium strategies
- yield management
- inventory level
- machine learning
- inventory policy
- solution concepts
- control policies
- initial state
- search algorithm
- sufficient conditions
- non stationary