The Convergence of a Cooperation Markov Decision Process System.
Xiaoling Mo, Daoyun Xu, Zufeng Fu
Published in: Entropy (2020)
Keyphrases
- random walk
- Markov decision process
- transition probabilities
- state space
- Markov decision processes
- reinforcement learning
- optimal policy
- stationary policies
- cooperative
- infinite horizon
- finite horizon
- policy iteration
- temporal difference learning
- transition matrices
- initial state
- multi agent systems
- multi agent
- convergence rate
- average cost
- long run
- supply chain
- Bayesian networks